Class: Bio::NCBI::REST
Constant Summary
- NCBI_INTERVAL = 3
  Make no more than one request every 3 seconds.
- @@last_access = nil
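The two constants above suggest a simple throttle: remember when the last request was made and sleep until NCBI_INTERVAL seconds have passed. The method listings on this page call ncbi_access_wait, whose body is not shown here, so the following is only a minimal sketch of that idea. The class name AccessThrottle and its clock and sleeper parameters are hypothetical and are not part of Bio::NCBI::REST.

```ruby
# Hypothetical stand-in for the kind of throttling ncbi_access_wait performs.
# The clock and sleeper are injectable so the logic can be exercised without
# actually sleeping.
class AccessThrottle
  def initialize(interval,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @interval = interval
    @clock = clock
    @sleeper = sleeper
    @last_access = nil
  end

  # Waits until at least @interval seconds have passed since the previous
  # call, then records the new access time. Returns the seconds waited.
  def wait
    waited = 0
    if @last_access
      elapsed = @clock.call - @last_access
      if elapsed < @interval
        waited = @interval - elapsed
        @sleeper.call(waited)
      end
    end
    @last_access = @clock.call
    waited
  end
end
```

With the defaults, AccessThrottle.new(3).wait returns immediately on the first call and blocks for the remainder of the 3-second window on later calls.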
Class Method Summary
- .efetch(*args) ⇒ Object
- .einfo ⇒ Object
- .esearch(*args) ⇒ Object
- .esearch_count(*args) ⇒ Object
Instance Method Summary
- #efetch(ids, hash = {}, step = 100) ⇒ Object
  Retrieves database entries for the given IDs using the E-Utils (efetch) service.
- #einfo ⇒ Object
  Lists the NCBI database names using the E-Utils (einfo) service.
- #esearch(str, hash = {}, limit = 100, step = 10000) ⇒ Object
  Searches the NCBI database for the given keywords using the E-Utils (esearch) service and returns an array of entry IDs.
- #esearch_count(str, hash = {}) ⇒ Object
  Arguments: same as the esearch method. Returns: the number of results.
Class Method Details
.efetch(*args) ⇒ Object
# File 'lib/bio/io/ncbirest.rb', line 245
def self.efetch(*args)
  self.new.efetch(*args)
end
.einfo ⇒ Object
# File 'lib/bio/io/ncbirest.rb', line 233
def self.einfo
  self.new.einfo
end
.esearch(*args) ⇒ Object
# File 'lib/bio/io/ncbirest.rb', line 237
def self.esearch(*args)
  self.new.esearch(*args)
end
.esearch_count(*args) ⇒ Object
# File 'lib/bio/io/ncbirest.rb', line 241
def self.esearch_count(*args)
  self.new.esearch_count(*args)
end
Instance Method Details
#efetch(ids, hash = {}, step = 100) ⇒ Object
Retrieves database entries for the given IDs using the E-Utils (efetch) service.
For information on the possible arguments, see
Usage
ncbi = Bio::NCBI::REST.new
ncbi.efetch("185041", {"db"=>"nucleotide", "rettype"=>"gb", "retmode" => "xml"})
ncbi.efetch("J00231", {"db"=>"nuccore", "rettype"=>"gb", "retmode"=>"xml"})
ncbi.efetch("AAA52805", {"db"=>"protein", "rettype"=>"gb"})
Bio::NCBI::REST.efetch("185041", {"db"=>"nucleotide", "rettype"=>"gb", "retmode" => "xml"})
Bio::NCBI::REST.efetch("J00231", {"db"=>"nuccore", "rettype"=>"gb"})
Bio::NCBI::REST.efetch("AAA52805", {"db"=>"protein", "rettype"=>"gb"})
Arguments:
- ids: list of NCBI entry IDs (required); an Array or a comma-separated String
- hash: hash of E-Utils options, e.g. {"db" => "nuccore", "rettype" => "gb"}
  - db: "sequences", "nucleotide", "protein", "pubmed", "omim", …
  - retmode: "text", "xml", "html", …
  - rettype: "gb", "gbc", "medline", "count", …
- step: maximum number of entries retrieved at a time
Returns: String
# File 'lib/bio/io/ncbirest.rb', line 205
def efetch(ids, hash = {}, step = 100)
  serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
  opts = {
    "tool"    => "bioruby",
    "retmode" => "text",
  }
  opts.update(hash)

  case ids
  when Array
    list = ids
  else
    list = ids.to_s.split(/\s*,\s*/)
  end

  result = ""
  0.step(list.size, step) do |i|
    opts["id"] = list[i, step].join(',')
    unless opts["id"].empty?
      ncbi_access_wait
      response = Bio::Command.post_form(serv, opts)
      result += response.body
    end
  end
  return result.strip
  #return result.strip.split(/\n\n+/)
end
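The ID handling in efetch can be tried out without any network access. The sketch below isolates the two steps visible in the listing: normalizing ids to a list, then joining at most step IDs per request. The helper name id_batches is hypothetical, and each_slice is swapped in for the 0.step loop, so no empty batch is produced.

```ruby
# Hypothetical helper mirroring how efetch normalizes and batches IDs.
# Returns one comma-joined "id" parameter value per request.
def id_batches(ids, step = 100)
  # Arrays are used as-is; anything else is treated as a comma-separated string.
  list = ids.is_a?(Array) ? ids : ids.to_s.split(/\s*,\s*/)
  list.each_slice(step).map { |chunk| chunk.join(",") }
end

batches = id_batches("185041, J00231,AAA52805", 2)
```

Here batches holds two strings, one for each request that efetch would send for three IDs with step = 2.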
#einfo ⇒ Object
Lists the NCBI database names using the E-Utils (einfo) service.
pubmed protein nucleotide nuccore nucgss nucest structure genome
books cancerchromosomes cdd gap domains gene genomeprj gensat geo
gds homologene journals mesh ncbisearch nlmcatalog omia omim pmc
popset probe proteinclusters pcassay pccompound pcsubstance snp
taxonomy toolkit unigene unists
Usage
ncbi = Bio::NCBI::REST.new
ncbi.einfo
Bio::NCBI::REST.einfo
Returns: array of strings (database names)
# File 'lib/bio/io/ncbirest.rb', line 66
def einfo
  serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi"
  opts = {}
  response = Bio::Command.post_form(serv, opts)
  result = response.body
  list = result.scan(/<DbName>(.*?)<\/DbName>/m).flatten
  return list
end
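The extraction step in einfo is a plain regular-expression scan over the response body. The sketch below applies the same pattern to a small hand-written XML sample (not real NCBI output) so the behavior can be checked offline. The method name parse_db_names is hypothetical.

```ruby
# Hypothetical helper applying the same pattern einfo uses on the response body.
def parse_db_names(xml)
  # scan returns one single-element array per capture; flatten gives the names.
  xml.scan(/<DbName>(.*?)<\/DbName>/m).flatten
end

# Hand-written sample for illustration only.
sample_xml = <<~XML
  <eInfoResult>
    <DbList>
      <DbName>pubmed</DbName>
      <DbName>protein</DbName>
      <DbName>nuccore</DbName>
    </DbList>
  </eInfoResult>
XML

db_names = parse_db_names(sample_xml)
```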
#esearch(str, hash = {}, limit = 100, step = 10000) ⇒ Object
Searches the NCBI database for the given keywords using the E-Utils (esearch) service and returns an array of entry IDs.
For information on the possible arguments, see
- eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html
- www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helppubmed.section.pubmedhelp.Search_Field_Descrip
Usage
ncbi = Bio::NCBI::REST.new
ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
ncbi.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})
Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
Bio::NCBI::REST.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})
Arguments:
- str: query string (required)
- hash: hash of E-Utils options, e.g. {"db" => "nuccore", "rettype" => "gb"}
  - db: "sequences", "nucleotide", "protein", "pubmed", "taxonomy", …
  - retmode: "text", "xml", "html", …
  - rettype: "gb", "medline", "count", …
  - retmax: integer (default 100)
  - retstart: integer
  - field:
    - "titl": Title [TI]
    - "tiab": Title/Abstract [TIAB]
    - "word": Text words [TW]
    - "auth": Author [AU]
    - "affl": Affiliation [AD]
    - "jour": Journal [TA]
    - "vol": Volume [VI]
    - "iss": Issue [IP]
    - "page": First page [PG]
    - "pdat": Publication date [DP]
    - "ptyp": Publication type [PT]
    - "lang": Language [LA]
    - "mesh": MeSH term [MH]
    - "majr": MeSH major topic [MAJR]
    - "subh": MeSH subheadings [SH]
    - "mhda": MeSH date [MHDA]
    - "ecno": EC/RN Number [RN]
    - "si": Secondary source ID [SI]
    - "uid": PubMed ID (PMID) [UI]
    - "fltr": Filter [FILTER] [SB]
    - "subs": Subset [SB]
  - reldate: 365
  - mindate: 2001
  - maxdate: 2002/01/01
  - datetype: "edat"
- limit: maximum number of entries to be returned (0 for unlimited)
- step: maximum number of entries retrieved at a time
Returns: array of entry IDs, or the number of results when rettype is "count"
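To show how the options above combine, here is a small sketch that builds the request parameters for a title-field PubMed search restricted by date. It mirrors the way esearch starts from its built-in "tool" and "term" entries and then applies the caller's hash on top. The helper name build_esearch_opts is hypothetical, and the option values are taken from the examples in the list above.

```ruby
# Hypothetical helper: start from the defaults esearch sets, then let the
# caller's hash add to or override them.
def build_esearch_opts(str, hash = {})
  opts = { "tool" => "bioruby", "term" => str }
  opts.merge(hash)
end

pubmed_opts = build_esearch_opts("tardigrada", {
  "db"       => "pubmed",
  "field"    => "titl",        # search the Title field only
  "mindate"  => "2001",
  "maxdate"  => "2002/01/01",
  "datetype" => "edat",
})
```

The same hash could be passed as the second argument to esearch.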
# File 'lib/bio/io/ncbirest.rb', line 133
def esearch(str, hash = {}, limit = 100, step = 10000)
  serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
  opts = {
    "tool" => "bioruby",
    "term" => str,
  }
  opts.update(hash)

  case opts["rettype"]
  when "count"
    count = esearch_count(str, opts)
    return count
  else
    limit = esearch_count(str, opts) if limit == 0   # unlimit
    list = []
    0.step(limit, step) do |i|
      retmax = [step, limit - i].min
      opts.update("retmax" => retmax, "retstart" => i)
      ncbi_access_wait
      response = Bio::Command.post_form(serv, opts)
      result = response.body
      list += result.scan(/<Id>(.*?)<\/Id>/m).flatten
    end
    return list
  end
end
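The paging in esearch comes down to computing a retstart/retmax pair for each request. The sketch below reproduces that arithmetic on its own, under one stated difference: the listing iterates 0.step(limit, step), which also yields a final zero-sized window when limit is an exact multiple of step, whereas this sketch stops at limit - 1 and skips it. The method name esearch_windows is hypothetical.

```ruby
# Hypothetical helper: the retstart/retmax pairs needed to fetch `limit`
# entries in pages of at most `step` entries.
def esearch_windows(limit, step = 10000)
  windows = []
  0.step(limit - 1, step) do |start|
    # The last page may be shorter than step.
    windows << { "retstart" => start, "retmax" => [step, limit - start].min }
  end
  windows
end

pages = esearch_windows(25, 10)
```

For limit = 25 and step = 10 this gives three pages of 10, 10, and 5 entries.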
#esearch_count(str, hash = {}) ⇒ Object
Arguments: same as the esearch method
Returns: the number of results (an Integer)
# File 'lib/bio/io/ncbirest.rb', line 163
def esearch_count(str, hash = {})
  serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
  opts = {
    "tool" => "bioruby",
    "term" => str,
  }
  opts.update(hash)
  opts.update("rettype" => "count")
  #ncbi_access_wait
  response = Bio::Command.post_form(serv, opts)
  result = response.body
  count = result.scan(/<Count>(.*?)<\/Count>/m).flatten.first.to_i
  return count
end
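The count extraction in esearch_count takes the first Count element in the response and converts it to an integer. The sketch below runs the same expression against a hand-written sample (not real NCBI output) that contains a second, nested Count, to show that only the first one is used. The method name parse_count is hypothetical.

```ruby
# Hypothetical helper applying the same expression esearch_count uses.
def parse_count(xml)
  # first picks the earliest <Count>; nil.to_i is 0, so a response
  # without any <Count> element yields 0.
  xml.scan(/<Count>(.*?)<\/Count>/m).flatten.first.to_i
end

# Hand-written sample for illustration only.
sample_xml = <<~XML
  <eSearchResult>
    <Count>42</Count>
    <TranslationStack>
      <TermSet><Term>tardigrada[All Fields]</Term><Count>40</Count></TermSet>
    </TranslationStack>
  </eSearchResult>
XML

total = parse_count(sample_xml)
```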