Class: Datasets::PMJTDatasetList
- Defined in:
- lib/datasets/pmjt-dataset-list.rb
Defined Under Namespace
Classes: Record
Instance Attribute Summary
Attributes inherited from Dataset
Instance Method Summary collapse
- #each(&block) ⇒ Object
-
#initialize ⇒ PMJTDatasetList
constructor
A new instance of PMJTDatasetList.
Methods inherited from Dataset
Constructor Details
#initialize ⇒ PMJTDatasetList
Returns a new instance of PMJTDatasetList.
19 20 21 22 23 24 25 26 27 28 29 30 31 |
# File 'lib/datasets/pmjt-dataset-list.rb', line 19 def initialize super() .id = "pmjt-dataset-list" .name = "List of pre-modern Japanese text dataset" .url = "http://codh.rois.ac.jp/pmjt/" .licenses = ["CC-BY-SA-4.0"] .description = " Pre-Modern Japanese Text, owned by National Institute of Japanese Literature, is released image and text data as open data.\n In addition, some text has description, transcription, and tagging data.\n DESCRIPTION\n\n @data_path = cache_dir_path + (@metadata.id + \".csv\")\nend\n" |
Instance Method Details
#each(&block) ⇒ Object
33 34 35 36 37 38 39 40 41 42 43 44 45 |
# File 'lib/datasets/pmjt-dataset-list.rb', line 33 def each(&block) return to_enum(__method__) unless block_given? latest_version = "201901" url = "http://codh.rois.ac.jp/pmjt/list/pmjt-dataset-list-#{latest_version}.csv" download(@data_path, url) CSV.open(@data_path, headers: :first_row, encoding: "Windows-31J:UTF-8") do |csv| csv.each do |row| record = create_record(row) yield record end end end |