TOC of the series
As a final step, we are going to refactor the previous code a bit to show other useful Racket features.
First, we can generalize the exported functions to read from any input port rather than only from strings. This is useful for clients that, for instance, prefer to pass a text file as the argument.
The client would then call isbn-find or isbn-find/list as follows (or in a similar way):
(with-input-from-file a-file-name isbn-find/list)
but the client could still pass strings with
(with-input-from-string a-string isbn-find/list)
Of these two functions, with-input-from-string is provided by racket/port, while with-input-from-file is already available in racket/base.
For more information about them see Input and Output [The Racket Guide], as well as the documentation for the functions mentioned.
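As a minimal sketch of this idea, the example below defines a function with an optional input port that defaults to (current-input-port). The helper count-nonempty-lines is hypothetical and is not part of the ISBN module; it only illustrates why the function can be handed directly to with-input-from-string, which rebinds the current input port and then calls the function with no arguments.

```racket
#lang racket/base
(require racket/port) ; provides with-input-from-string

;; Hypothetical helper (not part of the ISBN module):
;; counts the non-empty lines read from an input port.
;; The port is optional and defaults to the current input port,
;; which is looked up each time the function is called.
(define (count-nonempty-lines [in (current-input-port)])
  (for/sum ([line (in-lines in)])
    (if (string=? line "") 0 1)))

;; Passing a port explicitly
(define n1 (count-nonempty-lines (open-input-string "a\n\nb\n")))

;; Letting with-input-from-string rebind the current input port
(define n2 (with-input-from-string "a\n\nb\n" count-nonempty-lines))
```

Both calls read the same three lines ("a", "" and "b"), so n1 and n2 are both 2.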
Another commonly used feature is defining functions with optional arguments and/or with named (a.k.a. keyword) arguments.
In the current refactoring, the find functions can receive one or two optional arguments, one of them named. Note that contracts for this kind of function are special: they use the ->* combinator.
For more information about function definitions with optional or keyword arguments see Functions (Procedures) [The Racket Guide].
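The following sketch shows how an ->* contract lines up with a definition that has optional and keyword arguments. The function greet and its arguments are made up for illustration. The first parenthesized group of ->* lists the mandatory arguments, the second lists the optional ones (positional first, then keyword), and the last item is the result contract.

```racket
#lang racket/base
(require racket/contract)

;; Hypothetical function: one mandatory argument (name),
;; one optional positional argument (greeting) and
;; one optional keyword argument (#:punct).
(define/contract (greet name [greeting "Hello"] #:punct [punct "!"])
  (->* (string?)                  ; mandatory arguments
       (string? #:punct string?)  ; optional arguments
       string?)                   ; result
  (string-append greeting ", " name punct))

(define g1 (greet "Ada"))              ; "Hello, Ada!"
(define g2 (greet "Ada" "Hi"))         ; "Hi, Ada!"
(define g3 (greet "Ada" #:punct "?"))  ; "Hello, Ada?"
```

In the ISBN module the mandatory group is empty, written (), because every argument of the find functions is optional.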
Finally, as a cosmetic touch, I have refactored the last two helpers, isbn-match* and isbn-normalize, to show a programming style called point-free style, which you will sometimes find in functional programming languages. In this style, arguments are implicit and everything on the surface is just functions and function composition. To write in this way we need curry and curryr from racket/function, as well as compose or compose1, a variant of compose for functions returning single values.
For more information about point-free style, the Haskell wiki page on Pointfree should be helpful. For information about curry and other functional tools, see Procedures [The Racket Reference].
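To make the contrast concrete, here is a small sketch that writes the same function twice, first with an explicit argument and then in point-free style. The function names are hypothetical and unrelated to the ISBN module. Note that compose1 applies its functions from right to left, so the last function listed runs first.

```racket
#lang racket/base
(require (only-in racket/function curry curryr)
         (only-in racket/string string-split string-join))

;; "Pointed" version: the argument str is named explicitly.
(define (shout-words/pointed str)
  (string-join (map string-upcase (string-split str)) "-"))

;; Point-free version: no argument is named.
;; curry fixes the leftmost argument of map,
;; curryr fixes the rightmost argument of string-join.
(define shout-words
  (compose1 (curryr string-join "-")   ; 3. join the words with "-"
            (curry map string-upcase)  ; 2. upcase every word
            string-split))             ; 1. split on whitespace

(define r1 (shout-words/pointed "hello point free"))
(define r2 (shout-words "hello point free"))
```

Both definitions produce "HELLO-POINT-FREE" for this input. Whether the point-free form reads better is largely a matter of taste.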
#lang racket/base

(require racket/contract)

; ----------------------------------------------------------

(provide
 (contract-out
  ; extracts all isbns from the given input
  [isbn-find/list (->* () (input-port?) (listof isbn?))]
  ; extracts the first isbn (of given format) from given input, if any
  ; default format: 'isbn-13
  [isbn-find (->* () (input-port? #:format isbn-format?) (or/c isbn? #f))]
  ; determines whether the given value is a valid isbn
  [isbn? predicate/c]
  ; determines whether the given value is an isbn format
  [isbn-format? predicate/c]))

; ----------------------------------------------------------

(require (only-in racket/function curry curryr))
(require (only-in racket/match match))
(require (only-in racket/sequence sequence/c))
(require (only-in racket/string string-normalize-spaces string-replace))

(module+ test
  (require rackunit))

; ----------------------------------------------------------
; Patterns and Regexes
; [See ISBN International User Manual 7e. Sect. 5]

; - Pattern Components
(define pat-isbn-sep "[ -]")
(define pat-isbn-id "ISBN(-1[03])?:? ")
(define pat-isbn-13-id "(?:ISBN(?:-13)?:? )?")
(define pat-isbn-10-id "(?:ISBN(?:-10)?:? )?")
(define pat-isbn-prefix "97[89][ -]")
(define pat-isbn-registration "\\d{1,5}[ -]")
(define pat-isbn-registrant "\\d{1,7}[ -]")
(define pat-isbn-publication "\\d{1,6}[ -]")
(define pat-isbn-13-check "\\d")
(define pat-isbn-10-check "[X\\d]")

; - Look ahead to ISBN groups
(define pat-isbn-13-look-ahead
  (string-append "(?=" pat-isbn-prefix pat-isbn-registration ")"))
(define pat-isbn-10-look-ahead
  (string-append "(?=" pat-isbn-registration ")"))

; - Main patterns
(define pat-isbn-13/groups
  (string-append pat-isbn-13-id
                 pat-isbn-13-look-ahead
                 pat-isbn-prefix
                 pat-isbn-registration
                 pat-isbn-registrant
                 pat-isbn-publication
                 pat-isbn-13-check))
(define pat-isbn-10/groups
  (string-append pat-isbn-10-id
                 pat-isbn-10-look-ahead
                 pat-isbn-registration
                 pat-isbn-registrant
                 pat-isbn-publication
                 pat-isbn-10-check))
(define pat-isbn-13/prefix
  (string-append pat-isbn-13-id pat-isbn-prefix "\\d{10}"))
(define pat-isbn-13-norm "\\d{13}")
(define pat-isbn-10-norm "\\d{9}[X\\d]")
(define pat-isbn-13
  (string-append pat-isbn-13-norm
                 "|"
                 pat-isbn-13/prefix
                 "|"
                 pat-isbn-13/groups))
(define pat-isbn-10
  (string-append pat-isbn-10-norm
                 "|"
                 pat-isbn-10/groups))
(define pat-isbn
  (string-append pat-isbn-13 "|" pat-isbn-10))

; - Regexes
(define re-isbn-id (regexp pat-isbn-id))
(define re-isbn-sep (regexp pat-isbn-sep))
(define re-isbn-13 (pregexp pat-isbn-13))
(define re-isbn-10 (pregexp pat-isbn-10))
(define re-isbn-13-norm (pregexp pat-isbn-13-norm))
(define re-isbn-10-norm (pregexp pat-isbn-10-norm))
(define re-isbn (pregexp pat-isbn))

; ----------------------------------------------------------
; Predicates

(module+ test
  (check-true (isbn? "9781593274917"))
  (check-true (isbn? "0262062186"))
  (check-false (isbn? #f))

  (check-true (isbn-13? "9781593274917"))
  (check-false (isbn-13? "0262062186"))
  (check-false (isbn-13? ""))

  (check-false (isbn-10? "9781593274917"))
  (check-true (isbn-10? "0262062186"))
  (check-false (isbn-10? 1))

  (check-true (isbn-string? "9781593274912"))
  (check-true (isbn-string? "026206218X"))
  (check-false (isbn-string? "97815932749122")) ;too long
  (check-false (isbn-string? "978159327491")) ;too short
  (check-false (isbn-string? "0262062189X")) ;too long
  (check-false (isbn-string? "026206218")) ;too short
  (check-false (isbn-string? "0-262-06218-6"))

  (check-true (isbn-13-string? "9781593274912"))
  (check-false (isbn-13-string? "97815932749122")) ;too long
  (check-false (isbn-13-string? "978159327491")) ;too short
  (check-false (isbn-13-string? #f))

  (check-true (isbn-10-string? "026206218X"))
  (check-false (isbn-10-string? "0262062189X")) ;too long
  (check-false (isbn-10-string? "026206218")) ;too short
  (check-false (isbn-10-string? #f))

  (check-true (isbn-format? 'isbn-13))
  (check-true (isbn-format? 'isbn-10))
  (check-false (isbn-format? "isbn-10")))

(define (isbn? v)
  (or (isbn-13? v) (isbn-10? v)))

(define (isbn-13? v)
  (and (isbn-13-string? v) (isbn-13-valid? v)))

(define (isbn-10? v)
  (and (isbn-10-string? v) (isbn-10-valid? v)))

(define (isbn-string? v)
  (or (isbn-13-string? v) (isbn-10-string? v)))

(define (isbn-13-string? v)
  (and (string? v)
       (regexp-match-exact? re-isbn-13-norm v)))

(define (isbn-10-string? v)
  (and (string? v)
       (regexp-match-exact? re-isbn-10-norm v)))

(define (isbn-format? v)
  (or (equal? v 'isbn-13)
      (equal? v 'isbn-10)))

; ----------------------------------------------------------
; ISBN Validation

; ISBN-13-String -> Boolean
; is the given isbn-13 string a valid isbn-13
(module+ test
  (check-true (isbn-13-valid? "9781593274917"))
  (check-true (isbn-13-valid? "9780201896831"))
  (check-false (isbn-13-valid? "9781593274912"))
  (check-false (isbn-13-valid? "9780201896834")))

(define/contract (isbn-13-valid? isbn-str)
  (-> isbn-13-string? boolean?)
  (isbn-checksumf (in-list (isbn-string->numbers isbn-str))
                  (in-cycle '(1 3))
                  10))

; ISBN-10-String -> Boolean
; is the given isbn-10 string a valid isbn-10
(module+ test
  (check-true (isbn-10-valid? "0262062186"))
  (check-true (isbn-10-valid? "026256114X"))
  (check-false (isbn-10-valid? "026206218X"))
  (check-false (isbn-10-valid? "0262561141")))

(define/contract (isbn-10-valid? isbn-str)
  (-> isbn-10-string? boolean?)
  (isbn-checksumf (in-list (isbn-string->numbers isbn-str))
                  (in-range 10 0 -1)
                  11))

; [Sequence-of N] [Sequence-of N] N -> Boolean
; abstract checksum algorithm for isbn validation
(module+ test
  (check-true (isbn-checksumf '(9 7 8 1 5 9 3 2 7 4 9 1 7)
                              (in-cycle '(1 3))
                              10))
  (check-false (isbn-checksumf '(0 2 6 2 0 6 2 1 8 10)
                               (in-range 10 0 -1)
                               11)))

(define/contract (isbn-checksumf multiplicands multipliers mod)
  (-> (sequence/c natural-number/c)
      (sequence/c natural-number/c)
      natural-number/c
      boolean?)
  (define sum
    (for/sum ([x multiplicands]
              [y multipliers])
      (* x y)))
  (zero? (modulo sum mod)))

; ISBN-String -> [List-of N]
; translates str into the numbers their isbn letters represent
(module+ test
  (check-equal? (isbn-string->numbers "026256114X")
                '(0 2 6 2 5 6 1 1 4 10)))

(define/contract (isbn-string->numbers str)
  (-> isbn-string? (listof natural-number/c))
  (for/list ([char (in-string str)])
    (match char
      [#\X 10]
      [_ (string->number (string char))])))

; ----------------------------------------------------------
; ISBN Extraction

; [Input-Port (current-input-port)] -> [List-of ISBN]
; extracts all isbns from in
(module+ test
  (require racket/port)
  (check-equal? (with-input-from-string "" isbn-find/list) '())
  (check-equal? (with-input-from-string "none" isbn-find/list) '())
  (check-equal? (with-input-from-file "test-isbn-examples" isbn-find/list)
                (list
                 ;isbn normalized
                 "0262062186"
                 "026256114X"
                 "1593274912"
                 "9781593274917"
                 "0201896834"
                 "9780201896831"
                 ;isbn w/ several id's and sep's
                 "0262062186"
                 "026256114X"
                 "0262062186"
                 "0201896834"
                 "9780201896831"
                 "026256114X"
                 "9780201896831")))

(define (isbn-find/list [in (current-input-port)])
  (for*/list ([line (in-lines in)]
              [candidate (in-list (isbn-match* line))]
              [isbn-str (in-value (isbn-normalize candidate))]
              #:when (isbn? isbn-str))
    isbn-str))

; [Input-Port (current-input-port)] [ISBN-Format 'isbn-13]
; -> [Maybe ISBN]
; extracts the first isbn of given format from in, if any
; default format: 'isbn-13
(module+ test
  (check-false (with-input-from-string "" isbn-find))
  (check-false (with-input-from-string "0262062186" isbn-find))
  (check-false (with-input-from-string
                "0262062186"
                (curry isbn-find #:format 'isbn-13)))
  (check-false (with-input-from-string
                "9781593274917"
                (curry isbn-find #:format 'isbn-10)))
  (check-equal? (with-input-from-file
                 "test-isbn-examples"
                 (curry isbn-find #:format 'isbn-10))
                "0262062186")
  (check-equal? (with-input-from-file
                 "test-isbn-examples"
                 (curry isbn-find #:format 'isbn-13))
                "9781593274917"))

(define (isbn-find [in (current-input-port)]
                   #:format [isbn-format 'isbn-13])
  (define p?
    (match isbn-format
      ['isbn-13 isbn-13?]
      ['isbn-10 isbn-10?]))
  (for*/or ([line (in-lines in)]
            [candidate (in-list (isbn-match* line))]
            [isbn-str (in-value (isbn-normalize candidate))]
            #:when (p? isbn-str))
    isbn-str))

; ----------------------------------------------------------
; Helpers

; String -> [List-of String]
; matches substrings in the given string looking like isbn tags
(module+ test
  (require racket/file)
  (check-equal? (isbn-match* "") '())
  (check-equal? (isbn-match* "abc\nd") '())
  (check-equal? (isbn-match* (file->string "test-isbn-examples"))
                (list
                 ;isbn normalized (all matched)
                 "0262062186"
                 "026256114X"
                 "1593274912"
                 "9781593274917"
                 "0201896834"
                 "9780201896831"
                 ;isbn w/ several id's and sep's (all matched)
                 "ISBN 0-262-06218-6"
                 "ISBN: 0 262 56114 X"
                 "ISBN-10 0 262 06218-6"
                 "ISBN-10: 0-201-89683-4"
                 "ISBN-13: 978-0-201-89683-1"
                 "ISBN-10: 0 262 56114 X"
                 "ISBN-13: 978-0201896831"
                 ;not isbn strings (usually impossible in real-world)
                 "026206218X" "0262561141" "9780201896834" ;partially matched
                 ;isbn strings, but isbn invalid (all matched)
                 "026206218X"
                 "0262561141"
                 "1593274913"
                 "9781593274912"
                 "0201896833"
                 "9780201896834")))

(define/contract isbn-match*
  (-> string? (listof string?))
  (compose1 (curry regexp-match* re-isbn)
            string-normalize-spaces))

; String -> String
; removes the isbn-id and, then, the isbn separators from str
(module+ test
  (check-equal? (isbn-normalize "123-45 67") "1234567")
  (check-equal? (isbn-normalize "ISBN 123") "123")
  (check-equal? (isbn-normalize "ISBN: 123") "123")
  (check-equal? (isbn-normalize "ISBN-10 123") "123")
  (check-equal? (isbn-normalize "ISBN-10: 123") "123")
  (check-equal? (isbn-normalize "ISBN-13 123") "123")
  (check-equal? (isbn-normalize "ISBN-13: 123") "123")
  (check-equal? (isbn-normalize "ISBN-14: hi") "ISBN14:hi"))

(define/contract isbn-normalize
  (-> string? string?)
  (compose1 (curryr string-replace re-isbn-sep "")
            (curryr string-replace re-isbn-id "")))
Next article in the series: Racket: exception handling, http requests, JSON