Wednesday, April 17, 2019

From HtDP to Racket. Racket (5): I/O, optional and keyword args, point-free style

As a final step, we are going to refactor the previous code a little to show some other useful Racket features.

First, we can generalize the exported functions so that they read from any input port, not only from strings. This is useful for clients that, for instance, prefer to pass a text file as the argument.

The client would then call isbn-find or isbn-find/list as follows (or in a similar way):

(with-input-from-file a-file-name isbn-find/list)

but the client could still pass strings with

(with-input-from-string a-string isbn-find/list)

Of those with-input functions, with-input-from-string is provided by racket/port, while with-input-from-file is already available from racket/base.

For more information about them see Input and Output [The Racket Guide], as well as the documentation for the functions mentioned.
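To make the idea concrete, here is a minimal sketch of a function that reads from the current input port by default, so that it can be used both with the with-input functions and with an explicit port. The helper count-lines is hypothetical and is not part of the ISBN module.

```racket
#lang racket/base

(require racket/port) ; provides with-input-from-string

; Hypothetical helper, for illustration only:
; counts the lines that can be read from `in`.
; The default port is looked up each time the function is called.
(define (count-lines [in (current-input-port)])
  (for/sum ([line (in-lines in)]) 1))

; Called as a thunk: the string becomes the current input port.
(define n1 (with-input-from-string "a\nb\nc" count-lines))

; Called directly with an explicit port.
(define n2 (count-lines (open-input-string "x\ny")))
```

Because the default value is the expression (current-input-port), the same function works unchanged whether the caller redirects the current port or passes a port explicitly.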

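Another commonly used feature is defining functions with optional arguments and/or with named (a.k.a. keyword) arguments.

In the current refactoring, the find functions take optional arguments: isbn-find/list takes an optional input port, and isbn-find takes an optional input port plus an optional keyword argument, #:format. Note that contracts for this kind of function are special: they use the ->* combinator, which lists the contracts for mandatory and optional arguments separately.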

For more information about function definitions with optional or keyword arguments see Functions (Procedures) [The Racket Guide].
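As a small illustration of optional and keyword arguments together with a ->* contract, consider the following sketch. The function repeat-string is made up for this example; only define/contract and ->* come from racket/contract.

```racket
#lang racket/base

(require racket/contract)

; Hypothetical function, for illustration only:
; repeats `str` `n` times (default 2), separated by `sep` (default "").
(define/contract (repeat-string str [n 2] #:sep [sep ""])
  ; ->* takes: mandatory argument contracts, optional argument
  ; contracts (positional and keyword), and the result contract.
  (->* (string?)
       (exact-nonnegative-integer? #:sep string?)
       string?)
  (apply string-append
         (for/list ([i (in-range n)])
           (if (zero? i) str (string-append sep str)))))

(define r1 (repeat-string "ab"))              ; both defaults
(define r2 (repeat-string "ab" 3))            ; optional positional
(define r3 (repeat-string "ab" 3 #:sep "-"))  ; optional keyword
```

Here r1 is "abab", r2 is "ababab" and r3 is "ab-ab-ab". The contracts in the module below follow the same shape, except that they have no mandatory arguments, hence the empty first list.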

Finally, as a cosmetic touch, I have refactored the last two helpers, isbn-match* and isbn-normalize, to show a programming style called point-free style, which you will occasionally come across in functional programming languages. In this style arguments are implicit, and everything on the surface is just functions and function composition. To write this way we need curry and curryr from racket/function, as well as compose or compose1, a variant of compose for functions that return a single value.

For more information about point-free style, the Haskell wiki page on Pointfree should be helpful. For information about curry and other functional tools, see Procedures [The Racket Reference].
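The difference between the two styles is perhaps easier to see side by side. The following sketch uses made-up helpers (clean/pointed, clean/point-free, strip-dashes, find-numbers) that are not part of the ISBN module; the library functions themselves are the ones mentioned above, plus string-trim from racket/string.

```racket
#lang racket/base

(require (only-in racket/function curry curryr)
         (only-in racket/string string-trim string-replace))

; Pointed style: the argument `s` is named explicitly.
(define (clean/pointed s)
  (string-replace (string-trim s) "-" ""))

; curryr fixes the rightmost arguments, leaving the string open.
(define strip-dashes (curryr string-replace "-" ""))

; Point-free style: no argument is named; compose1 applies
; string-trim first and strip-dashes second.
(define clean/point-free
  (compose1 strip-dashes string-trim))

; curry fixes the leftmost argument instead (here, the regexp).
(define find-numbers (curry regexp-match* #rx"[0-9]+"))
```

Both clean/pointed and clean/point-free turn "  0-262-06218-6 " into "0262062186", and (find-numbers "a1b22") yields '("1" "22"). Whether the point-free version reads better is largely a matter of taste.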


#lang racket/base

(require racket/contract)

; ----------------------------------------------------------

(provide
 (contract-out
  ; extracts all isbns from the given input
  [isbn-find/list (->* () (input-port?) (listof isbn?))]
  
  ; extracts the first isbn (of given format) from given input, if any
  ; default format: 'isbn-13
  [isbn-find      (->* () (input-port? #:format isbn-format?)
                       (or/c isbn? #f))]
  
  ; determines whether the given value is a valid isbn
  [isbn?          predicate/c]

  ; determines whether the given value is an isbn format
  [isbn-format?   predicate/c]))

; ----------------------------------------------------------

(require (only-in racket/function
                  curry
                  curryr))
(require (only-in racket/match
                  match))
(require (only-in racket/sequence
                  sequence/c))
(require (only-in racket/string
                  string-normalize-spaces
                  string-replace))

(module+ test
  (require rackunit))

; ----------------------------------------------------------
; Patterns and Regexes
; [See ISBN International User Manual 7e. Sect. 5]

; - Pattern Components
(define pat-isbn-sep "[ -]")
(define pat-isbn-id "ISBN(-1[03])?:? ")
(define pat-isbn-13-id "(?:ISBN(?:-13)?:? )?")
(define pat-isbn-10-id "(?:ISBN(?:-10)?:? )?")
(define pat-isbn-prefix "97[89][ -]")
(define pat-isbn-registration "\\d{1,5}[ -]")
(define pat-isbn-registrant "\\d{1,7}[ -]")
(define pat-isbn-publication "\\d{1,6}[ -]")
(define pat-isbn-13-check "\\d")
(define pat-isbn-10-check "[X\\d]")

; - Look ahead to ISBN groups
(define pat-isbn-13-look-ahead
  (string-append "(?=" pat-isbn-prefix pat-isbn-registration ")"))

(define pat-isbn-10-look-ahead
  (string-append "(?=" pat-isbn-registration ")")) 

; - Main patterns
(define pat-isbn-13/groups
  (string-append pat-isbn-13-id
                 pat-isbn-13-look-ahead
                 pat-isbn-prefix
                 pat-isbn-registration
                 pat-isbn-registrant
                 pat-isbn-publication
                 pat-isbn-13-check))

(define pat-isbn-10/groups
  (string-append pat-isbn-10-id
                 pat-isbn-10-look-ahead
                 pat-isbn-registration
                 pat-isbn-registrant
                 pat-isbn-publication
                 pat-isbn-10-check))

(define pat-isbn-13/prefix
  (string-append pat-isbn-13-id pat-isbn-prefix "\\d{10}"))

(define pat-isbn-13-norm "\\d{13}")

(define pat-isbn-10-norm "\\d{9}[X\\d]")

(define pat-isbn-13
  (string-append pat-isbn-13-norm "|"
                 pat-isbn-13/prefix "|"
                 pat-isbn-13/groups))

(define pat-isbn-10
  (string-append pat-isbn-10-norm "|"
                 pat-isbn-10/groups))

(define pat-isbn
  (string-append pat-isbn-13 "|"
                 pat-isbn-10))

; - Regexes
(define re-isbn-id (regexp pat-isbn-id))
(define re-isbn-sep (regexp pat-isbn-sep))
(define re-isbn-13 (pregexp pat-isbn-13))
(define re-isbn-10 (pregexp pat-isbn-10))
(define re-isbn-13-norm (pregexp pat-isbn-13-norm))
(define re-isbn-10-norm (pregexp pat-isbn-10-norm))
(define re-isbn (pregexp pat-isbn))

; ----------------------------------------------------------
; Predicates

(module+ test
  (check-true (isbn? "9781593274917"))
  (check-true (isbn? "0262062186"))
  (check-false (isbn? #f))
  
  (check-true (isbn-13? "9781593274917"))
  (check-false (isbn-13? "0262062186"))
  (check-false (isbn-13? ""))
  
  (check-false (isbn-10? "9781593274917"))
  (check-true (isbn-10? "0262062186"))
  (check-false (isbn-10? 1))
  
  (check-true (isbn-string? "9781593274912"))
  (check-true (isbn-string? "026206218X"))
  (check-false (isbn-string? "97815932749122")) ;too long
  (check-false (isbn-string? "978159327491"))   ;too short
  (check-false (isbn-string? "0262062189X"))    ;too long
  (check-false (isbn-string? "026206218"))      ;too short
  (check-false (isbn-string? "0-262-06218-6"))
  
  (check-true (isbn-13-string? "9781593274912"))
  (check-false (isbn-13-string? "97815932749122")) ;too long
  (check-false (isbn-13-string? "978159327491"))   ;too short
  (check-false (isbn-13-string? #f))
  
  (check-true (isbn-10-string? "026206218X"))
  (check-false (isbn-10-string? "0262062189X")) ;too long
  (check-false (isbn-10-string? "026206218"))   ;too short
  (check-false (isbn-10-string? #f))
  
  (check-true (isbn-format? 'isbn-13))
  (check-true (isbn-format? 'isbn-10))
  (check-false (isbn-format? "isbn-10")))
  
(define (isbn? v)
  (or (isbn-13? v) (isbn-10? v)))

(define (isbn-13? v)
  (and (isbn-13-string? v) (isbn-13-valid? v)))

(define (isbn-10? v)
  (and (isbn-10-string? v) (isbn-10-valid? v)))

(define (isbn-string? v)
  (or (isbn-13-string? v) (isbn-10-string? v)))

(define (isbn-13-string? v)
  (and (string? v) (regexp-match-exact? re-isbn-13-norm v)))

(define (isbn-10-string? v)
  (and (string? v) (regexp-match-exact? re-isbn-10-norm v)))

(define (isbn-format? v)
  (or (equal? v 'isbn-13) (equal? v 'isbn-10)))

; ----------------------------------------------------------
; ISBN Validation

; ISBN-13-String -> Boolean
; is the given isbn-13 string a valid isbn-13

(module+ test
  (check-true (isbn-13-valid? "9781593274917"))
  (check-true (isbn-13-valid? "9780201896831"))
  (check-false (isbn-13-valid? "9781593274912"))
  (check-false (isbn-13-valid? "9780201896834")))

(define/contract (isbn-13-valid? isbn-str)
  (-> isbn-13-string? boolean?)
  (isbn-checksumf (in-list (isbn-string->numbers isbn-str))
                  (in-cycle '(1 3))
                  10))

; ISBN-10-String -> Boolean
; is the given isbn-10 string a valid isbn-10

(module+ test
  (check-true (isbn-10-valid? "0262062186"))
  (check-true (isbn-10-valid? "026256114X"))
  (check-false (isbn-10-valid? "026206218X"))
  (check-false (isbn-10-valid? "0262561141")))

(define/contract (isbn-10-valid? isbn-str)
  (-> isbn-10-string? boolean?)
  (isbn-checksumf (in-list (isbn-string->numbers isbn-str))
                  (in-range 10 0 -1)
                  11))

; [Sequence-of N] [Sequence-of N] N -> Boolean
; abstract checksum algorithm for isbn validation

(module+ test
  (check-true (isbn-checksumf '(9 7 8 1 5 9 3 2 7 4 9 1 7)
                              (in-cycle '(1 3))
                              10))
  (check-false (isbn-checksumf '(0 2 6 2 0 6 2 1 8 10)
                               (in-range 10 0 -1)
                               11)))

(define/contract (isbn-checksumf multiplicands multipliers mod)
  (-> (sequence/c natural-number/c)
      (sequence/c natural-number/c)
      natural-number/c
      boolean?)
  (define sum
    (for/sum ([x multiplicands]
              [y multipliers])
      (* x y)))
  (zero? (modulo sum mod)))

; ISBN-String -> [List-of N]
; translates str into the numbers its isbn characters represent

(module+ test
  (check-equal? (isbn-string->numbers "026256114X")
                '(0 2 6 2 5 6 1 1 4 10)))

(define/contract (isbn-string->numbers str)
  (-> isbn-string? (listof natural-number/c))
  (for/list ([char (in-string str)])
    (match char
      [#\X 10]
      [_ (string->number (string char))])))

; ----------------------------------------------------------
; ISBN Extraction

; [Input-Port (current-input-port)] -> [List-of ISBN]
; extracts all isbns from in

(module+ test
  (require racket/port)
  
  (check-equal? (with-input-from-string "" isbn-find/list) '())
  (check-equal? (with-input-from-string "none" isbn-find/list) '())
  (check-equal?
   (with-input-from-file "test-isbn-examples" isbn-find/list)
   (list
    ;isbn normalized
    "0262062186" "026256114X" "1593274912"
    "9781593274917" "0201896834" "9780201896831"
    ;isbn w/ several id's and sep's
    "0262062186" "026256114X" "0262062186" "0201896834"
    "9780201896831" "026256114X" "9780201896831")))

(define (isbn-find/list [in (current-input-port)])
  (for*/list ([line (in-lines in)]
              [candidate (in-list (isbn-match* line))]
              [isbn-str (in-value (isbn-normalize candidate))]
              #:when (isbn? isbn-str))
    isbn-str))

; [Input-Port (current-input-port)] [ISBN-Format 'isbn-13]
; -> [Maybe ISBN]
; extracts the first isbn of given format from in, if any
; default format: 'isbn-13

(module+ test
  (check-false (with-input-from-string "" isbn-find))
  (check-false (with-input-from-string "0262062186" isbn-find))
  (check-false (with-input-from-string "0262062186"
                 (curry isbn-find #:format 'isbn-13)))
  (check-false (with-input-from-string "9781593274917"
                 (curry isbn-find #:format 'isbn-10)))
  (check-equal? (with-input-from-file "test-isbn-examples"
                  (curry isbn-find #:format 'isbn-10))
                "0262062186")
  (check-equal? (with-input-from-file "test-isbn-examples"
                  (curry isbn-find #:format 'isbn-13))
                "9781593274917"))

(define (isbn-find [in (current-input-port)]
                   #:format [isbn-format 'isbn-13])
  (define p?
    (match isbn-format
      ['isbn-13 isbn-13?]
      ['isbn-10 isbn-10?]))
  (for*/or ([line (in-lines in)]
            [candidate (in-list (isbn-match* line))]
            [isbn-str (in-value (isbn-normalize candidate))]
            #:when (p? isbn-str))
    isbn-str))

; ----------------------------------------------------------
; Helpers

; String -> [List-of String]
; matches substrings in the given string looking like isbn tags

(module+ test
  (require racket/file)
  
  (check-equal? (isbn-match* "") '())
  (check-equal? (isbn-match* "abc\nd") '())
  (check-equal?
   (isbn-match* (file->string "test-isbn-examples"))
   (list
    ;isbn normalized (all matched)
    "0262062186" "026256114X" "1593274912"
    "9781593274917" "0201896834" "9780201896831"
    ;isbn w/ several id's and sep's (all matched)
    "ISBN 0-262-06218-6"
    "ISBN: 0 262 56114 X"
    "ISBN-10 0 262 06218-6"
    "ISBN-10: 0-201-89683-4"
    "ISBN-13: 978-0-201-89683-1"
    "ISBN-10: 0 262 56114 X"
    "ISBN-13: 978-0201896831"
    ;not isbn strings (usually impossible in real-world)
    "026206218X" "0262561141" "9780201896834" ;partially matched
    ;isbn strings, but isbn invalid (all matched)
    "026206218X" "0262561141" "1593274913"
    "9781593274912" "0201896833" "9780201896834")))

(define/contract isbn-match*
  (-> string? (listof string?))
  (compose1 (curry regexp-match* re-isbn)
            string-normalize-spaces))

; String -> String
; removes the isbn-id and, then, the isbn separators from str

(module+ test
  (check-equal? (isbn-normalize "123-45 67") "1234567")
  (check-equal? (isbn-normalize "ISBN 123") "123")
  (check-equal? (isbn-normalize "ISBN: 123") "123")
  (check-equal? (isbn-normalize "ISBN-10 123") "123")
  (check-equal? (isbn-normalize "ISBN-10: 123") "123")
  (check-equal? (isbn-normalize "ISBN-13 123") "123")
  (check-equal? (isbn-normalize "ISBN-13: 123") "123")
  (check-equal? (isbn-normalize "ISBN-14: hi") "ISBN14:hi"))

(define/contract isbn-normalize
  (-> string? string?)
  (compose1 (curryr string-replace re-isbn-sep "")
            (curryr string-replace re-isbn-id "")))
