Method: Bio::NCBI::REST#esearch

Defined in:
lib/bio/io/ncbirest.rb

#esearch(str, hash = {}, limit = 100, step = 10000) ⇒ Object

Search the NCBI database by given keywords using E-Utils (esearch) service and returns an array of entry IDs.

For information on the possible arguments, see

Usage

ncbi = Bio::NCBI::REST.new
ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
ncbi.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})

Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
Bio::NCBI::REST.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})

Arguments:

  • str: query string (required)

  • hash: hash of E-Utils option => “nuccore”, “rettype” => “gb”

    • db: “sequences”, “nucleotide”, “protein”, “pubmed”, “taxonomy”, …

    • retmode: “text”, “xml”, “html”, …

    • rettype: “gb”, “medline”, “count”, …

    • retmax: integer (default 100)

    • retstart: integer

    • field:

      • “titl”: Title [TI]

      • “tiab”: Title/Abstract [TIAB]

      • “word”: Text words [TW]

      • “auth”: Author [AU]

      • “affl”: Affiliation [AD]

      • “jour”: Journal [TA]

      • “vol”: Volume [VI]

      • “iss”: Issue [IP]

      • “page”: First page [PG]

      • “pdat”: Publication date [DP]

      • “ptyp”: Publication type [PT]

      • “lang”: Language [LA]

      • “mesh”: MeSH term [MH]

      • “majr”: MeSH major topic [MAJR]

      • “subh”: Mesh sub headings [SH]

      • “mhda”: MeSH date [MHDA]

      • “ecno”: EC/RN Number [rn]

      • “si”: Secondary source ID [SI]

      • “uid”: PubMed ID (PMID) [UI]

      • “fltr”: Filter [FILTER] [SB]

      • “subs”: Subset [SB]

    • reldate: 365

    • mindate: 2001

    • maxdate: 2002/01/01

    • datetype: “edat”

  • limit: maximum number of entries to be returned (0 for unlimited)

  • step: maximum number of entries retrieved at a time

Returns

array of entry IDs or a number of results



133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
# File 'lib/bio/io/ncbirest.rb', line 133

def esearch(str, hash = {}, limit = 100, step = 10000)
  serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
  opts = {
    "tool"   => "bioruby",
    "term"   => str,
  }
  opts.update(hash)

  case opts["rettype"]
  when "count"
    count = esearch_count(str, opts)
    return count
  else
    limit = esearch_count(str, opts) if limit == 0   # unlimit

    list = []
    0.step(limit, step) do |i|
      retmax = [step, limit - i].min
      opts.update("retmax" => retmax, "retstart" => i)
      ncbi_access_wait
      response = Bio::Command.post_form(serv, opts)
      result = response.body
      list += result.scan(/<Id>(.*?)<\/Id>/m).flatten
    end
    return list
  end
end