Class: Arx::Cleaner
- Inherits:
- Object
- Defined in:
- lib/arx/cleaner.rb
Overview
Class for cleaning strings and for extracting arXiv identifiers and version numbers from them.
Constant Summary
- URL_PREFIX =
arXiv paper URL prefix format
/^(https?\:\/\/)?(www.)?arxiv\.org\/abs\//
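As a rough illustration of what the pattern above accepts, the sketch below copies the regular expression into a standalone method so it runs without the Arx gem. The helper names `url_prefix` and `strip_url_prefix` are hypothetical and are not part of the library.

```ruby
# Regex copied from the URL_PREFIX listing above. It is wrapped in a method
# (hypothetical name) so this sketch does not need the Arx gem.
def url_prefix
  /^(https?\:\/\/)?(www.)?arxiv\.org\/abs\//
end

# Hypothetical helper: removes the arXiv URL prefix when it is present and
# returns the string unchanged otherwise.
def strip_url_prefix(string)
  string.sub(url_prefix, '')
end

puts strip_url_prefix('https://arxiv.org/abs/1105.5379')
puts strip_url_prefix('http://www.arxiv.org/abs/cond-mat/9609089')
puts strip_url_prefix('1105.5379v2')
```

The scheme and the `www` part are both optional, so a bare `arxiv.org/abs/...` string also matches. The `.` in `(www.)?` is unescaped, so it matches any single character after `www`, not only a literal dot.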
Class Method Summary
- .clean(string) ⇒ String
  Replaces line breaks with spaces, strips surrounding whitespace, and collapses repeated spaces.
- .extract_id(string, version: false) ⇒ String
  Attempts to extract an arXiv identifier from a string such as a URL.
- .extract_version(string) ⇒ Integer
  Attempts to extract a version number from an arXiv identifier.
Class Method Details
.clean(string) ⇒ String
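Cleans a string: replaces each line break (\r\n, \r, or \n) with a space, strips leading and trailing whitespace, and collapses runs of spaces into a single space.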
# File 'lib/arx/cleaner.rb', line 17
def clean(string)
  string.gsub(/\r\n|\r|\n/, ' ').strip.squeeze ' '
end
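A short usage sketch may help here. The method body below is copied from the listing above into a top-level method so the example runs without the Arx gem; the sample string is made up for illustration.

```ruby
# Standalone copy of the method body from the listing above.
def clean(string)
  string.gsub(/\r\n|\r|\n/, ' ').strip.squeeze ' '
end

# Made-up input with mixed line endings and repeated spaces.
raw = "  A multi-line\r\ntitle   with\nextra spaces \n"
puts clean(raw).inspect
```

Because `squeeze ' '` is given a single space as its argument, only runs of spaces are collapsed. Tabs and repeated letters are left untouched.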
.extract_id(string, version: false) ⇒ String
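Attempts to extract an arXiv identifier from a string such as a URL. The URL prefix and any trailing slash are removed, the result is validated with Validate.id?, and the version suffix is dropped unless version is true. Raises TypeError if string is not a String or version is not a boolean, and ArgumentError if no valid identifier can be extracted.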
# File 'lib/arx/cleaner.rb', line 26
def extract_id(string, version: false)
  if version == !!version
    if string.is_a? String
      trimmed = /#{URL_PREFIX}.+\/?$/.match?(string) ? string.gsub(/(#{URL_PREFIX})|(\/$)/, '') : string
      raise ArgumentError.new("Couldn't extract arXiv identifier from: #{string}") unless Validate.id? trimmed
      version ? trimmed : trimmed.sub(/v[0-9]+$/, '')
    else
      raise TypeError.new("Expected `string` to be a String, got: #{string.class}")
    end
  else
    raise TypeError.new("Expected `version` to be boolean (TrueClass or FalseClass), got: #{version.class}")
  end
end
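The control flow above can be sketched as a standalone method, under two assumptions. First, `looks_like_id?` is a hypothetical and much simplified stand-in for `Validate.id?`, whose source is not shown on this page. Second, the nested conditionals are flattened into guard clauses for readability, which is a restructuring and not the library's own layout. The names `url_prefix` and `extract_id_sketch` are likewise illustrative.

```ruby
# Regex copied from the URL_PREFIX constant documented on this page.
def url_prefix
  /^(https?\:\/\/)?(www.)?arxiv\.org\/abs\//
end

# ASSUMPTION: simplified stand-in for Validate.id?, not the library's check.
# Accepts "1105.5379"-style and "cond-mat/9609089"-style identifiers with an
# optional "vN" suffix.
def looks_like_id?(string)
  /\A(\d{4}\.\d{4,5}|[a-z\-]+(\.[A-Z]{2})?\/\d{7})(v\d+)?\z/.match?(string)
end

# Sketch of the extract_id flow using guard clauses.
def extract_id_sketch(string, version: false)
  unless version == !!version
    raise TypeError, "Expected `version` to be boolean (TrueClass or FalseClass), got: #{version.class}"
  end
  raise TypeError, "Expected `string` to be a String, got: #{string.class}" unless string.is_a? String

  prefix = url_prefix
  # Strip the URL prefix and a trailing slash only when the string is a URL.
  trimmed = /#{prefix}.+\/?$/.match?(string) ? string.gsub(/(#{prefix})|(\/$)/, '') : string
  raise ArgumentError, "Couldn't extract arXiv identifier from: #{string}" unless looks_like_id?(trimmed)

  # Keep the "vN" suffix only when the caller asked for it.
  version ? trimmed : trimmed.sub(/v[0-9]+$/, '')
end

puts extract_id_sketch('https://arxiv.org/abs/1105.5379v2')
puts extract_id_sketch('https://arxiv.org/abs/1105.5379v2/', version: true)
puts extract_id_sketch('cond-mat/9609089')
```

The `version == !!version` comparison is true only for `true` and `false`, so values such as `1` or `nil` are rejected with a TypeError instead of being treated as truthy or falsy.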
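.extract_version(string) ⇒ Integer
Attempts to extract a version number from an arXiv identifier. The version is returned as an Integer. Raises ArgumentError if the identifier has no version suffix.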
# File 'lib/arx/cleaner.rb', line 44
def extract_version(string)
  reversed = extract_id(string, version: true).reverse
  if /^[0-9]+v/.match? reversed
    reversed.partition('v').first.reverse.to_i
  else
    raise ArgumentError.new("Couldn't extract version number from identifier: #{string}")
  end
end
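The reverse-and-partition step in the listing above can be tried in isolation. The sketch below assumes its input is already a bare identifier and skips the `extract_id(string, version: true)` call that the real method performs first. The name `version_of` is hypothetical.

```ruby
# ASSUMPTION: `identifier` is already a bare arXiv identifier. The documented
# method first normalises its input with extract_id(string, version: true).
def version_of(identifier)
  reversed = identifier.reverse
  # After reversing, a trailing "vN" suffix becomes a leading "Nv".
  unless /^[0-9]+v/.match?(reversed)
    raise ArgumentError, "Couldn't extract version number from identifier: #{identifier}"
  end
  # Everything before the first "v" is the reversed version number.
  reversed.partition('v').first.reverse.to_i
end

puts version_of('1105.5379v12')
puts version_of('cond-mat/9609089v3')
```

Reversing the string means `partition('v')` splits on the last `v` of the original identifier, so a `v` earlier in the identifier does not interfere with the version suffix.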