Wednesday, April 17, 2019

From HtDP to Racket. Racket (6): exceptions, http requests, json

I finish this series with a couple of new modules (each in its own file) that minimally accomplish the remaining sub-tasks mentioned in the first post of the series.

Let's recall those tasks, this time with a reference to the responsible module.

  1. Read the text contained in the pdf pages.
  2. Search for the ISBN on those pages.
  3. Make a query for the ISBN to a remote server that provides information about books.
  4. Parse the response of the server to represent it as a Racket data type.

The first and second tasks above are carried out by "pdf-isbn.rkt" with the aid of "isbn.rkt".

The third and fourth tasks, in turn, are solved by "book-info.rkt".

"pdf-isbn" does the same thing as "isbn", but over pdf files. It uses the latter along with the pdf-read package, a Racket interface to the popular libpoppler library, which is available by default on Linux and, to my knowledge, on macOS systems. You will probably need to install pdf-read before running the code. Don't worry: installing packages from DrRacket is easy and largely self-explanatory.

As for the relevant Racket constructs used in "pdf-isbn", it is worth mentioning error, which raises an exception when the pdf file doesn't exist. To test exceptions, use check-exn from rackunit.
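A minimal sketch of that pair, error and check-exn, may help. The helper must-exist and the file name are made up for the illustration; only error, file-exists? and check-exn are actual Racket and rackunit names.

```racket
#lang racket/base

(require rackunit)

; String Symbol -> String
; hypothetical helper: produces path if the file exists,
; raises an exception on behalf of src otherwise
(define (must-exist path src)
  (unless (file-exists? path)
    (error src "~s not found" path))
  path)

; check-exn takes a predicate and a thunk, so the failing call
; has to be wrapped in a lambda
(check-exn exn:fail?
           (lambda () (must-exist "surely-no-such-file.pdf" 'demo)))

; a regexp can be used instead of a predicate; it is matched
; against the message of the exception
(check-exn #rx"not found"
           (lambda () (must-exist "surely-no-such-file.pdf" 'demo)))
```

Both checks pass silently, as long as no file with that name exists in the current directory.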

Another interesting divergence between Racket and the *SL languages has to do with or and conditionals in general. In Racket every value except #false is treated as true in a condition, and or produces the first such value rather than a boolean. This fact allows us to use something like:

(or (extract/format 'isbn-13) (extract/format 'isbn-10))
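To see the same idiom in isolation, here is a small sketch. The two hash tables and the function find-isbn are invented for the example; they merely stand in for the two searches.

```racket
#lang racket/base

; hypothetical lookup tables standing in for the two searches
(define isbn-13s (hash "book-a" "9781593274917"))
(define isbn-10s (hash "book-a" "1593274912"
                       "book-b" "0262062186"))

; String -> [Maybe String]
; the third argument of hash-ref is the value produced on a miss
(define (find-isbn key)
  (or (hash-ref isbn-13s key #f)
      (hash-ref isbn-10s key #f)))

(find-isbn "book-a") ; the first operand is a string, so it is the result
(find-isbn "book-b") ; the first operand is #f, so the second one is tried
(find-isbn "book-c") ; both operands are #f, so the result is #f
```

In a *SL language the same function would need a cond with explicit boolean tests.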

The module "book-info" requests bibliographic information about a book from a remote provider (via functions belonging to the net/url library) and parses the response in two phases: 1. it parses the JSON response via read-json (from the json library), and 2. it converts the output of read-json into the structure book-info. For the second phase I have resorted to the package json-pointer, which you may need to install on your system.
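The two phases can be sketched without touching the network. What follows is only an illustration under my own assumptions: the response text is made up, the structure book is much smaller than book-info, and plain hash-ref takes the place of json-pointer.

```racket
#lang racket/base

(require (only-in json read-json)
         (only-in racket/port call-with-input-string))

; hypothetical structure, much smaller than book-info
(struct book [title authors] #:transparent)

; a made-up response, loosely shaped like a provider's answer
(define response
  (string-append
   "{\"ISBN:123\": {\"title\": \"A Title\","
   " \"authors\": [{\"name\": \"A. Author\"}]}}"))

; phase 1: text -> jsexpr
; JSON objects become hashes with symbols as keys,
; JSON arrays become lists
(define jsexpr
  (call-with-input-string response read-json))

; phase 2: jsexpr -> structure
(define (jsexpr->book js)
  (define entry (hash-ref js (string->symbol "ISBN:123")))
  (book (hash-ref entry 'title)
        (for/list ([a (in-list (hash-ref entry 'authors))])
          (hash-ref a 'name))))

(jsexpr->book jsexpr)
```

Keeping the phases apart means that the second one can be tested with hand-written jsexprs, with no server involved.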

Note that only the Open Library provider is supported. Other providers like Google Books, The Library of Congress, etc. could also be supported in a similar way.

A few frequently used Racket constructs that are new in "book-info" are define-values, which binds multiple identifiers at once, and with-handlers, which lets you handle an exception as you wish.
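Here is a tiny sketch of both constructs, unrelated to books. The function safe-div is a name I made up for the example.

```racket
#lang racket/base

; define-values binds one identifier per value produced by the
; expression; quotient/remainder produces two values
(define-values (q r) (quotient/remainder 17 5))

; Number Number -> [Maybe Number]
; with-handlers pairs a predicate with a handler; the result of
; the handler becomes the result of the whole expression
(define (safe-div a b)
  (with-handlers ([exn:fail:contract:divide-by-zero?
                   (lambda (e) #f)])
    (/ a b)))

(list q r)     ; => '(3 2)
(safe-div 6 3) ; => 2
(safe-div 6 0) ; => #f
```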


; ----------------------------------------------------------
; pdf-isbn.rkt
; ----------------------------------------------------------

#lang racket/base

(require racket/contract)

; ----------------------------------------------------------

(provide
 (contract-out
  ; extracts all isbn's from the given pdf document
  ; raises exception when the pdf file does not exist
  [extract-isbn-from-pdf/list (-> pdf-document? (listof isbn?))]

  ; extracts the first isbn from the given pdf document, if any
  ; favors isbn-13 over isbn-10
  ; raises exception when the pdf file does not exist
  [extract-isbn-from-pdf      (-> pdf-document? (or/c isbn? #f))]))

; ----------------------------------------------------------

(require (only-in racket/function
                  curry))
(require (only-in racket/port
                  call-with-input-string))
(require pdf-read)
(require "isbn.rkt")

(module+ test
  (require rackunit)
  (require (only-in racket/function
                    thunk))

  ; examples for tests
  (define pdf-with-isbn "test-isbn-examples.pdf")
  (define pdf-with-isbn-10-only "test-with-isbn-10-only.pdf")
  (define pdf-without-isbn "test-without-isbn.pdf"))

; ----------------------------------------------------------

; PDF-Document -> [List-of ISBN]

(module+ test
  (check-exn exn:fail? (thunk (extract-isbn-from-pdf/list "hi.pdf")))
  (check-equal?
   (extract-isbn-from-pdf/list pdf-without-isbn)
   '())
  
  (check-equal?
   (extract-isbn-from-pdf/list pdf-with-isbn)
   (list "0262062186" "026256114X" "1593274912"
         "9781593274917" "0201896834" "9780201896831"
         "0262062186" "026256114X" "0262062186" "0201896834"
         "9780201896831" "026256114X" "9780201896831")))

(define (extract-isbn-from-pdf/list f)
  (check-input f 'extract-isbn-from-pdf/list)  
  (for/fold ([isbns '()])
            ([pg# (in-range (pdf-count-pages f))])
    (append isbns
            (call-with-input-string (page-text (pdf-page f pg#))
                                    isbn-find/list))))

; PDF-Document -> [Maybe ISBN]

(module+ test
  (check-exn exn:fail? (thunk (extract-isbn-from-pdf "hi.pdf")))
  (check-equal?
   (extract-isbn-from-pdf pdf-with-isbn)
   "9781593274917")
  (check-equal?
   (extract-isbn-from-pdf pdf-with-isbn-10-only)
   "0262062186")
  (check-false
   (extract-isbn-from-pdf pdf-without-isbn)))

(define (extract-isbn-from-pdf f)
  (check-input f 'extract-isbn-from-pdf)  
  (define (extract/format format)
    (for/or ([pg# (in-range (pdf-count-pages f))])
      (call-with-input-string (page-text (pdf-page f pg#))
                              (curry isbn-find #:format format))))
  
  (or (extract/format 'isbn-13) (extract/format 'isbn-10)))

; PDF-Document Symbol -> PDF-Document
; effect: report error for src if f doesn't exist

(module+ test
  (check-exn exn:fail? (thunk (check-input "not-avail.pdf" 'fun)))
  (check-pred file-exists? (check-input pdf-with-isbn 'fun)))

(define/contract (check-input f src)
  (-> pdf-document? symbol? pdf-document?)
  (unless (file-exists? f)
    (error src "~s not found" f))
  f)


; ----------------------------------------------------------
; book-info.rkt
; ----------------------------------------------------------

#lang racket/base

(require racket/contract)

; ----------------------------------------------------------

(provide
 (contract-out
  ; record of book information
  [struct book-info    ([isbn isbn?]
                        [authors (listof string?)]
                        [date string?]
                        [title string?]
                        [places (listof string?)]
                        [publishers (listof string?)])]

  ; gets information about a book of given isbn from given provider 
  ; produces a book-info with only the isbn filled if no info avail
  [book-retrieve-info  (-> isbn? provider? book-info?)]

  ; determines whether the given is a provider of book information
  [provider?           predicate/c]))

; ----------------------------------------------------------

(require (only-in racket/function
                  curry))
(require (only-in racket/match
                  match))
(require (only-in racket/string
                  string-replace))
         
(require (only-in json
                  read-json
                  jsexpr?))
(require (only-in json-pointer
                  json-pointer?
                  json-pointer-expression?
                  json-pointer-value))
(require (only-in net/url
                  call/input-url
                  get-pure-port
                  string->url
                  url?))

(require (only-in "isbn.rkt"
                  isbn?))

(module+ test
  (require rackunit)
  (require (only-in racket/function
                    thunk)))

; ----------------------------------------------------------
; Data Types

(struct book-info [isbn authors date title places publishers]
  #:transparent)

(define book-info-template
  (book-info "" '() "" "" '() '()))

; Providers:
; ol: openlibrary
; gb: google-books (not implemented)
; ...
(define (provider? v)
  (or (equal? v 'ol) (equal? v 'gb)))

; ----------------------------------------------------------
; Book Info Builder

; ISBN Provider -> Book-Info

; TODO: test

(define (book-retrieve-info isbn provider)
  (define-values (request reader parser)
    (match provider
      ['ol (values (book-query isbn uri-ol)
                   read-json
                   (curry parse/ol isbn))]
      [_ (error 'book-retrieve-info
                "provider ~a not implemented" provider)]))
  
  (call/input-url request
                  get-pure-port
                  (compose1 parser reader)))

; ISBN String -> Url
; produces the url from given template and isbn

(module+ test
  (check-equal?
   (book-query "0262062186" "http://example.com?id:$$isbn$$")
   (string->url "http://example.com?id:0262062186")))

(define/contract (book-query isbn url)
  (-> isbn? string? url?)
  (string->url (string-replace url "$$isbn$$" isbn)))

; ----------------------------------------------------------
; Providers

; - Open Library (ol)

; query template
(define uri-ol
  (string-append
   "http://openlibrary.org/api/books?bibkeys=ISBN:$$isbn$$"
   "&format=json"
   "&jscmd=data"))

; parser

; ISBN JSExpr -> Book-Info
; parses the given jsexpr to get book info from OL
; produces a book info with only the isbn filled when OL knows
; nothing about it

; TODO: test

(define/contract (parse/ol isbn jsexpr)
  (-> isbn? jsexpr? book-info?)
  (define (build-book-info)
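    ; the first key of the hash, e.g. ISBN:<number>;
    ; hash-iterate-first yields a valid position on any non-empty hash
    (define base-point
      (symbol->string
       (hash-iterate-key jsexpr (hash-iterate-first jsexpr))))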
    (define authors
      (json-pointer-value/index (cons base-point '("authors"))
                                '("name")
                                jsexpr))
    (define date
      (json-pointer-value (cons base-point '("publish_date"))
                          jsexpr))
    (define title
      (json-pointer-value (cons base-point '("title"))
                          jsexpr))
    (define places
      (json-pointer-value/index (cons base-point '("publish_places"))
                                '("name")
                                jsexpr))
    (define publishers
      (json-pointer-value/index (cons base-point '("publishers"))
                                '("name")
                                jsexpr))
    (book-info isbn authors date title places publishers))
  
  (define (build-book-info/not-avail)
    (struct-copy book-info book-info-template [isbn isbn]))
    
  (match jsexpr
    [(? hash-empty?) (build-book-info/not-avail)]
    [_ (build-book-info)]))
    
; ----------------------------------------------------------
; Helpers (extending json-pointer)

; [JSON-Pointer | JSON-Pointer-Expr] JSExpr -> JSExpr
; wrapper to produce #f when json-pointer raises an exception

(module+ test
  (check-equal?
   (json-pointer-value/false "/a" (hash 'a 1))
   (json-pointer-value "/a" (hash 'a 1)))
  (check-false
   (json-pointer-value/false "/a" (hash 'b 2))))

(define/contract (json-pointer-value/false jp jsexpr)
  (-> (or/c json-pointer? json-pointer-expression?)
      jsexpr?
      jsexpr?)
  (with-handlers ([exn:fail? (lambda (e) #f)])
    (json-pointer-value jp jsexpr)))

; JSON-Pointer-Expr JSON-Pointer-Expr JSExpr -> JSExpr
; gets all the values at /pre/n/post in jsexpr for all legal n's

(module+ test
  ;json example
  (define jse-ex
    '#hasheq((a
              .
              #hasheq((b
                       .
                       (#hasheq((x . 1) (y . 2))
                        #hasheq((x . 3) (y . 4))))
                      (c . (#hasheq((y . 0))))))))
  
  (check-equal?
   (json-pointer-value/index '("a" "b") '("y") jse-ex)
   '(2 4))
  (check-equal?
   (json-pointer-value/index '("a" "c") '("y") jse-ex)
   '(0)))

(define/contract (json-pointer-value/index pre post jsexpr)
  (-> (or/c json-pointer? json-pointer-expression?)
      (or/c json-pointer? json-pointer-expression?)
      jsexpr?
      jsexpr?)
  (for*/list ([n (in-naturals)]
              [v (in-value (json-pointer-value/false
                            (append pre
                                    (list (number->string n))
                                    post)
                            jsexpr))]
              #:break (not v))
    v))

From HtDP to Racket. Racket (5): I/O, optional and keyword args, point-free style

As a final step, we are going to refactor the prior code a bit to show other useful Racket features.

First, we can generalize the exported functions to read from any input port, not only from strings. This is useful for clients that, for instance, prefer to pass a text file as the argument.

The client would then call isbn-find or isbn-find/list as follows (or in a similar way):

(with-input-from-file a-file-name isbn-find/list)

but the client could still pass strings with

(with-input-from-string a-string isbn-find/list)

Of those two functions, with-input-from-string is provided by racket/port, while with-input-from-file is already available in racket/base.

For more information about them see Input and Output [The Racket Guide], as well as the documentation for the functions mentioned.
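The mechanism behind this is that a function whose port argument defaults to (current-input-port) can be passed directly as the thunk. A minimal sketch follows, with a hypothetical count-lines in place of the isbn functions.

```racket
#lang racket/base

(require (only-in racket/port with-input-from-string))

; [Input-Port (current-input-port)] -> Natural
; hypothetical reader: counts the lines available on in
(define (count-lines [in (current-input-port)])
  (for/sum ([line (in-lines in)]) 1))

; with-input-from-string sets the current input port while the
; thunk runs, so count-lines is passed without any argument
(with-input-from-string "one\ntwo\nthree" count-lines) ; => 3

; passing a port explicitly still works
(count-lines (open-input-string "one\ntwo"))           ; => 2
```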

Another commonly used feature is defining functions with optional arguments and/or named (a.k.a. keyword) arguments.

In the current refactoring the find functions take optional arguments: isbn-find/list takes one and isbn-find takes two, one of them named. Note that contracts for this kind of function are special: they use the construct ->*.

For more information about function definitions with optional or keyword arguments see Functions (Procedures) [The Racket Guide].
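As a small sketch of how the definition and the ->* contract line up, consider the following function. The name greet and its arguments are invented for the example.

```racket
#lang racket/base

(require racket/contract)

; hypothetical function with one mandatory argument, one optional
; positional argument and one optional keyword argument
(define/contract (greet name [greeting "Hello"] #:mark [mark "!"])
  (->* (string?)                ; mandatory arguments
       (string? #:mark string?) ; optional arguments
       string?)                 ; result
  (string-append greeting ", " name mark))

(greet "Racket")             ; => "Hello, Racket!"
(greet "Racket" "Hi")        ; => "Hi, Racket!"
(greet "Racket" #:mark "?")  ; => "Hello, Racket?"
```

The first group of ->* lists the mandatory arguments and the second group the optional ones, keywords included. In isbn.rkt the first group is empty because every argument is optional.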

Finally, as a cosmetic touch, I have refactored the last two helpers, isbn-match* and isbn-normalize, to show a programming style called point-free style, which you will find in many functional programming languages. In this style arguments are implicit and everything on the surface is just functions and function composition. To write this way we need curry and curryr from racket/function, as well as compose or compose1, a variant of compose restricted to functions that return a single value.

For more information about point-free style, the Haskell wiki page on Pointfree should be helpful. For information about curry and other functional tools, see Procedures [The Racket Reference].
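To make the contrast concrete, here is the same toy function written in both styles. The names shout, shout/pf and add-prefix are hypothetical and have nothing to do with isbn.rkt.

```racket
#lang racket/base

(require (only-in racket/function curry curryr)
         (only-in racket/string string-replace))

; String -> String
; pointful version: the argument str is named explicitly
(define (shout str)
  (string-upcase (string-replace str "-" " ")))

; point-free version: no argument is named
; curryr fixes the rightmost arguments of string-replace;
; compose1 applies its functions from right to left
(define shout/pf
  (compose1 string-upcase
            (curryr string-replace "-" " ")))

; curry fixes the leftmost arguments instead
(define add-prefix (curry string-append "isbn: "))

(shout "point-free")    ; => "POINT FREE"
(shout/pf "point-free") ; => "POINT FREE"
(add-prefix "123")      ; => "isbn: 123"
```

Whether the point-free version reads better is a matter of taste; for short pipelines like this one I find it does.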


#lang racket/base

(require racket/contract)

; ----------------------------------------------------------

(provide
 (contract-out
  ; extracts all isbns from the given input
  [isbn-find/list (->* () (input-port?) (listof isbn?))]
  
  ; extracts the first isbn (of given format) from given input, if any
  ; default format: 'isbn-13
  [isbn-find      (->* () (input-port? #:format isbn-format?)
                       (or/c isbn? #f))]
  
  ; determines whether the given value is a valid isbn
  [isbn?          predicate/c]

  ; determines whether the given value is an isbn format
  [isbn-format?   predicate/c]))

; ----------------------------------------------------------

(require (only-in racket/function
                  curry
                  curryr))
(require (only-in racket/match
                  match))
(require (only-in racket/sequence
                  sequence/c))
(require (only-in racket/string
                  string-normalize-spaces
                  string-replace))

(module+ test
  (require rackunit))

; ----------------------------------------------------------
; Patterns and Regexes
; [See ISBN International User Manual 7e. Sect. 5]

; - Pattern Components
(define pat-isbn-sep "[ -]")
(define pat-isbn-id "ISBN(-1[03])?:? ")
(define pat-isbn-13-id "(?:ISBN(?:-13)?:? )?")
(define pat-isbn-10-id "(?:ISBN(?:-10)?:? )?")
(define pat-isbn-prefix "97[89][ -]")
(define pat-isbn-registration "\\d{1,5}[ -]")
(define pat-isbn-registrant "\\d{1,7}[ -]")
(define pat-isbn-publication "\\d{1,6}[ -]")
(define pat-isbn-13-check "\\d")
(define pat-isbn-10-check "[X\\d]")

; - Look ahead to ISBN groups
(define pat-isbn-13-look-ahead
  (string-append "(?=" pat-isbn-prefix pat-isbn-registration ")"))

(define pat-isbn-10-look-ahead
  (string-append "(?=" pat-isbn-registration ")")) 

; - Main patterns
(define pat-isbn-13/groups
  (string-append pat-isbn-13-id
                 pat-isbn-13-look-ahead
                 pat-isbn-prefix
                 pat-isbn-registration
                 pat-isbn-registrant
                 pat-isbn-publication
                 pat-isbn-13-check))

(define pat-isbn-10/groups
  (string-append pat-isbn-10-id
                 pat-isbn-10-look-ahead
                 pat-isbn-registration
                 pat-isbn-registrant
                 pat-isbn-publication
                 pat-isbn-10-check))

(define pat-isbn-13/prefix
  (string-append pat-isbn-13-id pat-isbn-prefix "\\d{10}"))

(define pat-isbn-13-norm "\\d{13}")

(define pat-isbn-10-norm "\\d{9}[X\\d]")

(define pat-isbn-13
  (string-append pat-isbn-13-norm "|"
                 pat-isbn-13/prefix "|"
                 pat-isbn-13/groups))

(define pat-isbn-10
  (string-append pat-isbn-10-norm "|"
                 pat-isbn-10/groups))

(define pat-isbn
  (string-append pat-isbn-13 "|"
                 pat-isbn-10))

; - Regexes
(define re-isbn-id (regexp pat-isbn-id))
(define re-isbn-sep (regexp pat-isbn-sep))
(define re-isbn-13 (pregexp pat-isbn-13))
(define re-isbn-10 (pregexp pat-isbn-10))
(define re-isbn-13-norm (pregexp pat-isbn-13-norm))
(define re-isbn-10-norm (pregexp pat-isbn-10-norm))
(define re-isbn (pregexp pat-isbn))

; ----------------------------------------------------------
; Predicates

(module+ test
  (check-true (isbn? "9781593274917"))
  (check-true (isbn? "0262062186"))
  (check-false (isbn? #f))
  
  (check-true (isbn-13? "9781593274917"))
  (check-false (isbn-13? "0262062186"))
  (check-false (isbn-13? ""))
  
  (check-false (isbn-10? "9781593274917"))
  (check-true (isbn-10? "0262062186"))
  (check-false (isbn-10? 1))
  
  (check-true (isbn-string? "9781593274912"))
  (check-true (isbn-string? "026206218X"))
  (check-false (isbn-string? "97815932749122")) ;too long
  (check-false (isbn-string? "978159327491"))   ;too short
  (check-false (isbn-string? "0262062189X"))    ;too long
  (check-false (isbn-string? "026206218"))      ;too short
  (check-false (isbn-string? "0-262-06218-6"))
  
  (check-true (isbn-13-string? "9781593274912"))
  (check-false (isbn-13-string? "97815932749122")) ;too long
  (check-false (isbn-13-string? "978159327491"))   ;too short
  (check-false (isbn-13-string? #f))
  
  (check-true (isbn-10-string? "026206218X"))
  (check-false (isbn-10-string? "0262062189X")) ;too long
  (check-false (isbn-10-string? "026206218"))   ;too short
  (check-false (isbn-10-string? #f))
  
  (check-true (isbn-format? 'isbn-13))
  (check-true (isbn-format? 'isbn-10))
  (check-false (isbn-format? "isbn-10")))
  
(define (isbn? v)
  (or (isbn-13? v) (isbn-10? v)))

(define (isbn-13? v)
  (and (isbn-13-string? v) (isbn-13-valid? v)))

(define (isbn-10? v)
  (and (isbn-10-string? v) (isbn-10-valid? v)))

(define (isbn-string? v)
  (or (isbn-13-string? v) (isbn-10-string? v)))

(define (isbn-13-string? v)
  (and (string? v) (regexp-match-exact? re-isbn-13-norm v)))

(define (isbn-10-string? v)
  (and (string? v) (regexp-match-exact? re-isbn-10-norm v)))

(define (isbn-format? v)
  (or (equal? v 'isbn-13) (equal? v 'isbn-10)))

; ----------------------------------------------------------
; ISBN Validation

; ISBN-13-String -> Boolean
; is the given isbn-13 string a valid isbn-13

(module+ test
  (check-true (isbn-13-valid? "9781593274917"))
  (check-true (isbn-13-valid? "9780201896831"))
  (check-false (isbn-13-valid? "9781593274912"))
  (check-false (isbn-13-valid? "9780201896834")))

(define/contract (isbn-13-valid? isbn-str)
  (-> isbn-13-string? boolean?)
  (isbn-checksumf (in-list (isbn-string->numbers isbn-str))
                  (in-cycle '(1 3))
                  10))

; ISBN-10-String -> Boolean
; is the given isbn-10 string a valid isbn-10

(module+ test
  (check-true (isbn-10-valid? "0262062186"))
  (check-true (isbn-10-valid? "026256114X"))
  (check-false (isbn-10-valid? "026206218X"))
  (check-false (isbn-10-valid? "0262561141")))

(define/contract (isbn-10-valid? isbn-str)
  (-> isbn-10-string? boolean?)
  (isbn-checksumf (in-list (isbn-string->numbers isbn-str))
                  (in-range 10 0 -1)
                  11))

; [Sequence-of N] [Sequence-of N] N -> Boolean
; abstract checksum algorithm for isbn validation

(module+ test
  (check-true (isbn-checksumf '(9 7 8 1 5 9 3 2 7 4 9 1 7)
                              (in-cycle '(1 3))
                              10))
  (check-false (isbn-checksumf '(0 2 6 2 0 6 2 1 8 10)
                               (in-range 10 0 -1)
                               11)))

(define/contract (isbn-checksumf multiplicands multipliers mod)
  (-> (sequence/c natural-number/c)
      (sequence/c natural-number/c)
      natural-number/c
      boolean?)
  (define sum
    (for/sum ([x multiplicands]
              [y multipliers])
      (* x y)))
  (zero? (modulo sum mod)))

; ISBN-String -> [List-of N]
; translates str into the numbers its isbn characters represent

(module+ test
  (check-equal? (isbn-string->numbers "026256114X")
                '(0 2 6 2 5 6 1 1 4 10)))

(define/contract (isbn-string->numbers str)
  (-> isbn-string? (listof natural-number/c))
  (for/list ([char (in-string str)])
    (match char
      [#\X 10]
      [_ (string->number (string char))])))

; ----------------------------------------------------------
; ISBN Extraction

; [Input-Port (current-input-port)] -> [List-of ISBN]
; extracts all isbns from in

(module+ test
  (require racket/port)
  
  (check-equal? (with-input-from-string "" isbn-find/list) '())
  (check-equal? (with-input-from-string "none" isbn-find/list) '())
  (check-equal?
   (with-input-from-file "test-isbn-examples" isbn-find/list)
   (list
    ;isbn normalized
    "0262062186" "026256114X" "1593274912"
    "9781593274917" "0201896834" "9780201896831"
    ;isbn w/ several id's and sep's
    "0262062186" "026256114X" "0262062186" "0201896834"
    "9780201896831" "026256114X" "9780201896831")))

(define (isbn-find/list [in (current-input-port)])
  (for*/list ([line (in-lines in)]
              [candidate (in-list (isbn-match* line))]
              [isbn-str (in-value (isbn-normalize candidate))]
              #:when (isbn? isbn-str))
    isbn-str))

; [Input-Port (current-input-port)] [ISBN-Format 'isbn-13]
; -> [Maybe ISBN]
; extracts the first isbn of given format from in, if any
; default format: 'isbn-13

(module+ test
  (check-false (with-input-from-string "" isbn-find))
  (check-false (with-input-from-string "0262062186" isbn-find))
  (check-false (with-input-from-string "0262062186"
                 (curry isbn-find #:format 'isbn-13)))
  (check-false (with-input-from-string "9781593274917"
                 (curry isbn-find #:format 'isbn-10)))
  (check-equal? (with-input-from-file "test-isbn-examples"
                  (curry isbn-find #:format 'isbn-10))
                "0262062186")
  (check-equal? (with-input-from-file "test-isbn-examples"
                  (curry isbn-find #:format 'isbn-13))
                "9781593274917"))

(define (isbn-find [in (current-input-port)]
                   #:format [isbn-format 'isbn-13])
  (define p?
    (match isbn-format
      ['isbn-13 isbn-13?]
      ['isbn-10 isbn-10?]))
  (for*/or ([line (in-lines in)]
            [candidate (in-list (isbn-match* line))]
            [isbn-str (in-value (isbn-normalize candidate))]
            #:when (p? isbn-str))
    isbn-str))

; ----------------------------------------------------------
; Helpers

; String -> [List-of String]
; matches substrings in the given string looking like isbn tags

(module+ test
  (require racket/file)
  
  (check-equal? (isbn-match* "") '())
  (check-equal? (isbn-match* "abc\nd") '())
  (check-equal?
   (isbn-match* (file->string "test-isbn-examples"))
   (list
    ;isbn normalized (all matched)
    "0262062186" "026256114X" "1593274912"
    "9781593274917" "0201896834" "9780201896831"
    ;isbn w/ several id's and sep's (all matched)
    "ISBN 0-262-06218-6"
    "ISBN: 0 262 56114 X"
    "ISBN-10 0 262 06218-6"
    "ISBN-10: 0-201-89683-4"
    "ISBN-13: 978-0-201-89683-1"
    "ISBN-10: 0 262 56114 X"
    "ISBN-13: 978-0201896831"
    ;not isbn strings (usually impossible in real-world)
    "026206218X" "0262561141" "9780201896834" ;partially matched
    ;isbn strings, but isbn invalid (all matched)
    "026206218X" "0262561141" "1593274913"
    "9781593274912" "0201896833" "9780201896834")))

(define/contract isbn-match*
  (-> string? (listof string?))
  (compose1 (curry regexp-match* re-isbn)
            string-normalize-spaces))

; String -> String
; removes the isbn-id and, then, the isbn separators from str

(module+ test
  (check-equal? (isbn-normalize "123-45 67") "1234567")
  (check-equal? (isbn-normalize "ISBN 123") "123")
  (check-equal? (isbn-normalize "ISBN: 123") "123")
  (check-equal? (isbn-normalize "ISBN-10 123") "123")
  (check-equal? (isbn-normalize "ISBN-10: 123") "123")
  (check-equal? (isbn-normalize "ISBN-13 123") "123")
  (check-equal? (isbn-normalize "ISBN-13: 123") "123")
  (check-equal? (isbn-normalize "ISBN-14: hi") "ISBN14:hi"))

(define/contract isbn-normalize
  (-> string? string?)
  (compose1 (curryr string-replace re-isbn-sep "")
            (curryr string-replace re-isbn-id "")))

From HtDP to Racket. Racket (4): provide, contracts

The next step involves a substantial transformation and a first look at a prominent Racket feature: contracts.

When you create code that may be used as a library by other client code, you probably don't want to make everything in it public. It is more likely that only certain parts are written for client use while the rest is only for the implementation. In order to specify the functions, predicates or whatever else you want to export for public use, you have to add an initial section with provide that states and documents the public interface from the very beginning.

Furthermore, you surely wish to state clearly what your exported functions expect as input and what kind of output they guarantee. Signatures are for that, but signatures are only comments and are never checked. Contracts are precisely what you wish for: they are checked at run time. So in code that you write for others, include contracts for everything you are going to export.
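A minimal sketch of an exported function with its contract could look like this. I have written the library as a submodule named lib only so that the example fits in a single file; the functions double and twice are made up.

```racket
#lang racket/base

; a hypothetical library, written as a submodule so that the
; whole example fits in one file
(module lib racket/base
  (require racket/contract)

  (provide
   (contract-out
    ; doubles the given number
    [double (-> number? number?)]))

  (define (double x) (twice x))

  ; private helper: it is not listed in provide,
  ; so clients cannot see it
  (define (twice x) (* 2 x)))

; the client side
(require 'lib)

(double 21) ; => 42
```

A call such as (double "x") from the client side raises a contract violation that names the client as the party to blame.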

Besides, you may wish to use contracts locally in all functions, including the private ones. In such a case you can use define/contract.
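For a feel of define/contract, here is a short sketch. The function repeat is invented for the example, and the with-handlers around the bad call is there only to show which exception is raised.

```racket
#lang racket/base

(require racket/contract)

; hypothetical function: the contract says it takes a string and
; a natural number, and produces a string
(define/contract (repeat str n)
  (-> string? natural-number/c string?)
  (apply string-append
         (for/list ([i (in-range n)]) str)))

(repeat "ab" 3) ; => "ababab"

; a call that breaks the contract raises exn:fail:contract
; before the body runs; here the exception is caught and
; replaced with a symbol
(with-handlers ([exn:fail:contract?
                 (lambda (e) 'contract-violation)])
  (repeat 3 "ab")) ; => 'contract-violation
```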

In the code below the first part is now the provide section with contracts for all exported functions. The rest of the functions are now defined with define/contract, just for more practice with function contracts.

Finally, since predicates and contracts are sufficient documentation for the code at hand, I have also removed the Data Types section present in prior versions.

Contracts can be overwhelming at first sight. Take your time: consult the chapter on Contracts [The Racket Guide] and go to Contracts [The Racket Reference] for further details.

As for provide, see Exports: provide [The Racket Guide] and Importing and Exporting [The Racket Reference].

As an aside, a new require has been added in order to use the contract sequence/c provided by racket/sequence in the definition of isbn-checksumf.
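As a sketch of what sequence/c buys us, the function below accepts lists as well as lazy sequences such as the one produced by in-cycle. The name weighted-sum is hypothetical; it only mirrors the summing part of isbn-checksumf.

```racket
#lang racket/base

(require racket/contract
         (only-in racket/sequence sequence/c))

; [Sequence-of N] [Sequence-of N] -> N
; hypothetical helper: sums the pairwise products of both sequences
; the parallel iteration stops when the shortest sequence ends
(define/contract (weighted-sum digits weights)
  (-> (sequence/c natural-number/c)
      (sequence/c natural-number/c)
      natural-number/c)
  (for/sum ([d digits]
            [w weights])
    (* d w)))

; digits of the isbn-13 9781593274917 with weights 1 3 1 3 ...
(weighted-sum '(9 7 8 1 5 9 3 2 7 4 9 1 7)
              (in-cycle '(1 3))) ; => 120
```

The sum, 120, is a multiple of 10, which is what the isbn-13 validation checks for.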


; ----------------------------------------------------------
; isbn-racket.v4.rkt
; - provide
; - contracts: contract-out, define/contract, ...
; - racket/contract
; - racket/sequence
; ----------------------------------------------------------

#lang racket/base

(require racket/contract)

; ----------------------------------------------------------

(provide
 (contract-out
  ; extracts all isbns from the given input
  [isbn-find/list (-> string? (listof isbn?))]
  
  ; extracts the first isbn (of given format) from given input, if any
  ; default format: 'isbn-13
  [isbn-find      (-> string? isbn-format? (or/c isbn? #f))]
  
  ; determines whether the given value is a valid isbn
  [isbn?          predicate/c]

  ; determines whether the given value is an isbn format
  [isbn-format?   predicate/c]))

; ----------------------------------------------------------

(require (only-in racket/match
                  match))
(require (only-in racket/sequence
                  sequence/c))
(require (only-in racket/string
                  string-normalize-spaces 
                  string-replace
                  string-split))

(module+ test
  (require rackunit)
  (require (only-in racket/file
                    file->string)))

; ----------------------------------------------------------
; Patterns and Regexes
; [See ISBN International User Manual 7e. Sect. 5]

; - Pattern Components
(define pat-isbn-sep "[ -]")
(define pat-isbn-id "ISBN(-1[03])?:? ")
(define pat-isbn-13-id "(?:ISBN(?:-13)?:? )?")
(define pat-isbn-10-id "(?:ISBN(?:-10)?:? )?")
(define pat-isbn-prefix "97[89][ -]")
(define pat-isbn-registration "\\d{1,5}[ -]")
(define pat-isbn-registrant "\\d{1,7}[ -]")
(define pat-isbn-publication "\\d{1,6}[ -]")
(define pat-isbn-13-check "\\d")
(define pat-isbn-10-check "[X\\d]")

; - Look ahead to ISBN groups
(define pat-isbn-13-look-ahead
  (string-append "(?=" pat-isbn-prefix pat-isbn-registration ")"))

(define pat-isbn-10-look-ahead
  (string-append "(?=" pat-isbn-registration ")")) 

; - Main patterns
(define pat-isbn-13/groups
  (string-append pat-isbn-13-id
                 pat-isbn-13-look-ahead
                 pat-isbn-prefix
                 pat-isbn-registration
                 pat-isbn-registrant
                 pat-isbn-publication
                 pat-isbn-13-check))

(define pat-isbn-10/groups
  (string-append pat-isbn-10-id
                 pat-isbn-10-look-ahead
                 pat-isbn-registration
                 pat-isbn-registrant
                 pat-isbn-publication
                 pat-isbn-10-check))

(define pat-isbn-13/prefix
  (string-append pat-isbn-13-id pat-isbn-prefix "\\d{10}"))

(define pat-isbn-13-norm "\\d{13}")

(define pat-isbn-10-norm "\\d{9}[X\\d]")

(define pat-isbn-13
  (string-append pat-isbn-13-norm "|"
                 pat-isbn-13/prefix "|"
                 pat-isbn-13/groups))

(define pat-isbn-10
  (string-append pat-isbn-10-norm "|"
                 pat-isbn-10/groups))

(define pat-isbn
  (string-append pat-isbn-13 "|"
                 pat-isbn-10))

; - Regexes
(define re-isbn-id (regexp pat-isbn-id))
(define re-isbn-sep (regexp pat-isbn-sep))
(define re-isbn-13 (pregexp pat-isbn-13))
(define re-isbn-10 (pregexp pat-isbn-10))
(define re-isbn-13-norm (pregexp pat-isbn-13-norm))
(define re-isbn-10-norm (pregexp pat-isbn-10-norm))
(define re-isbn (pregexp pat-isbn))

; ----------------------------------------------------------
; Predicates

(module+ test
  (check-true (isbn? "9781593274917"))
  (check-true (isbn? "0262062186"))
  (check-false (isbn? #f))
  
  (check-true (isbn-13? "9781593274917"))
  (check-false (isbn-13? "0262062186"))
  (check-false (isbn-13? ""))
  
  (check-false (isbn-10? "9781593274917"))
  (check-true (isbn-10? "0262062186"))
  (check-false (isbn-10? 1))
  
  (check-true (isbn-string? "9781593274912"))
  (check-true (isbn-string? "026206218X"))
  (check-false (isbn-string? "97815932749122")) ;too long
  (check-false (isbn-string? "978159327491"))   ;too short
  (check-false (isbn-string? "0262062189X"))    ;too long
  (check-false (isbn-string? "026206218"))      ;too short
  (check-false (isbn-string? "0-262-06218-6"))
  
  (check-true (isbn-13-string? "9781593274912"))
  (check-false (isbn-13-string? "97815932749122")) ;too long
  (check-false (isbn-13-string? "978159327491"))   ;too short
  (check-false (isbn-13-string? #f))
  
  (check-true (isbn-10-string? "026206218X"))
  (check-false (isbn-10-string? "0262062189X")) ;too long
  (check-false (isbn-10-string? "026206218"))   ;too short
  (check-false (isbn-10-string? #f))
  
  (check-true (isbn-format? 'isbn-13))
  (check-true (isbn-format? 'isbn-10))
  (check-false (isbn-format? "isbn-10")))
  
(define (isbn? v)
  (or (isbn-13? v) (isbn-10? v)))

(define (isbn-13? v)
  (and (isbn-13-string? v) (isbn-13-valid? v)))

(define (isbn-10? v)
  (and (isbn-10-string? v) (isbn-10-valid? v)))

(define (isbn-string? v)
  (or (isbn-13-string? v) (isbn-10-string? v)))

(define (isbn-13-string? v)
  (and (string? v) (regexp-match-exact? re-isbn-13-norm v)))

(define (isbn-10-string? v)
  (and (string? v) (regexp-match-exact? re-isbn-10-norm v)))

(define (isbn-format? v)
  (or (equal? v 'isbn-13) (equal? v 'isbn-10)))

; ----------------------------------------------------------
; ISBN Validation

; ISBN-13-String -> Boolean
; is the given isbn-13 string a valid isbn-13, i.e. is the sum of
; its digits weighted 1 3 1 3 ... a multiple of 10?

(module+ test
  (check-true (isbn-13-valid? "9781593274917"))
  (check-true (isbn-13-valid? "9780201896831"))
  (check-false (isbn-13-valid? "9781593274912"))
  (check-false (isbn-13-valid? "9780201896834")))

(define/contract (isbn-13-valid? isbn-str)
  (-> isbn-13-string? boolean?)
  (isbn-checksumf (in-list (isbn-string->numbers isbn-str))
                  (in-cycle '(1 3))
                  10))

; ISBN-10-String -> Boolean
; is the given isbn-10 string a valid isbn-10, i.e. is the sum of
; its digits weighted 10 9 ... 1 a multiple of 11?

(module+ test
  (check-true (isbn-10-valid? "0262062186"))
  (check-true (isbn-10-valid? "026256114X"))
  (check-false (isbn-10-valid? "026206218X"))
  (check-false (isbn-10-valid? "0262561141")))

(define/contract (isbn-10-valid? isbn-str)
  (-> isbn-10-string? boolean?)
  (isbn-checksumf (in-list (isbn-string->numbers isbn-str))
                  (in-range 10 0 -1)
                  11))

; [Sequence-of N] [Sequence-of N] N -> Boolean
; abstract checksum for isbn validation: is the sum of the pairwise
; products of multiplicands and multipliers a multiple of mod?

(module+ test
  (check-true (isbn-checksumf '(9 7 8 1 5 9 3 2 7 4 9 1 7)
                              '(1 3 1 3 1 3 1 3 1 3 1 3 1)
                              10))
  (check-false (isbn-checksumf '(0 2 6 2 0 6 2 1 8 10)
                               (in-range 10 0 -1)
                               11)))

(define/contract (isbn-checksumf multiplicands multipliers mod)
  (-> (sequence/c natural-number/c)
      (sequence/c natural-number/c)
      natural-number/c
      boolean?)
  (define sum
    (for/sum ([x multiplicands]
              [y multipliers])
      (* x y)))
  (zero? (modulo sum mod)))

; ISBN-String -> [List-of N]
; translates str into the numbers its characters represent (X is 10)

(module+ test
  (check-equal? (isbn-string->numbers "026256114X")
                '(0 2 6 2 5 6 1 1 4 10)))

(define/contract (isbn-string->numbers str)
  (-> isbn-string? (listof natural-number/c))
  (for/list ([char (in-string str)])
    (match char
      [#\X 10]
      [_ (string->number (string char))])))

; ----------------------------------------------------------
; ISBN Extraction

; String -> [List-of ISBN]
; extracts all isbns from str

(module+ test
  (check-equal? (isbn-find/list "") '())
  (check-equal? (isbn-find/list "none") '())
  (check-equal?
   (isbn-find/list (file->string "test-isbn-examples"))
   (list
    ;isbn normalized
    "0262062186" "026256114X" "1593274912"
    "9781593274917" "0201896834" "9780201896831"
    ;isbn w/ several id's and sep's
    "0262062186" "026256114X" "0262062186" "0201896834"
    "9780201896831" "026256114X" "9780201896831")))

(define (isbn-find/list str)
  (for*/list ([line (in-list (string-split str "\n"))]
              [candidate (in-list (isbn-match* line))]
              [isbn-str (in-value (isbn-normalize candidate))]
              #:when (isbn? isbn-str))
    isbn-str))

; String ISBN-Format -> [Maybe ISBN]
; extracts the first isbn of given format from str, if any

(module+ test
  (check-false (isbn-find "" 'isbn-13))
  (check-false (isbn-find "" 'isbn-10))
  (check-false (isbn-find "0262062186" 'isbn-13))
  (check-false (isbn-find "9781593274917" 'isbn-10))
  (check-equal? (isbn-find (file->string "test-isbn-examples")
                           'isbn-13)
                "9781593274917")
  (check-equal? (isbn-find (file->string "test-isbn-examples")
                           'isbn-10)
                "0262062186"))

(define (isbn-find str format)
  (define p?
    (match format
      ['isbn-13 isbn-13?]
      ['isbn-10 isbn-10?]))
  (for*/or ([line (in-list (string-split str "\n"))]
            [candidate (in-list (isbn-match* line))]
            [isbn-str (in-value (isbn-normalize candidate))]
            #:when (p? isbn-str))
    isbn-str))

; ----------------------------------------------------------
; Helpers

; String -> [List-of String]
; finds the substrings of the given string that look like isbn tags

(module+ test
  (check-equal? (isbn-match* "") '())
  (check-equal? (isbn-match* "abc\nd") '())
  (check-equal?
   (isbn-match* (file->string "test-isbn-examples"))
   (list
    ;isbn normalized (all matched)
    "0262062186" "026256114X" "1593274912"
    "9781593274917" "0201896834" "9780201896831"
    ;isbn w/ several id's and sep's (all matched)
    "ISBN 0-262-06218-6"
    "ISBN: 0 262 56114 X"
    "ISBN-10 0 262 06218-6"
    "ISBN-10: 0-201-89683-4"
    "ISBN-13: 978-0-201-89683-1"
    "ISBN-10: 0 262 56114 X"
    "ISBN-13: 978-0201896831"
    ;not isbn strings (usually impossible in real-world)
    "026206218X" "0262561141" "9780201896834" ;partially matched
    ;isbn strings, but isbn invalid (all matched)
    "026206218X" "0262561141" "1593274913"
    "9781593274912" "0201896833" "9780201896834")))

(define/contract (isbn-match* str)
  (-> string? (listof string?))
  (regexp-match* re-isbn (string-normalize-spaces str)))

; String -> String
; removes the isbn-id prefix and then the isbn separators from str

(module+ test
  (check-equal? (isbn-normalize "123-45 67") "1234567")
  (check-equal? (isbn-normalize "ISBN 123") "123")
  (check-equal? (isbn-normalize "ISBN: 123") "123")
  (check-equal? (isbn-normalize "ISBN-10 123") "123")
  (check-equal? (isbn-normalize "ISBN-10: 123") "123")
  (check-equal? (isbn-normalize "ISBN-13 123") "123")
  (check-equal? (isbn-normalize "ISBN-13: 123") "123")
  (check-equal? (isbn-normalize "ISBN-14: hi") "ISBN14:hi"))

(define/contract (isbn-normalize str)
  (-> string? string?)
  (string-replace
   (string-replace str re-isbn-id "") re-isbn-sep ""))
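
The checksum arithmetic behind isbn-13-valid? and isbn-10-valid? can be traced by hand. What follows is only a minimal, self-contained sketch and not part of "isbn.rkt": weighted-sum is a hypothetical helper, introduced here to expose the intermediate sum that isbn-checksumf reduces with modulo. The digit lists are the ones already used in the tests above.

```racket
#lang racket

; weighted-sum is a hypothetical helper (an assumption for this
; sketch, not a function of the original module): it returns the
; sum of the pairwise products of digits and weights.
(define (weighted-sum digits weights)
  (for/sum ([d digits]
            [w weights])
    (* d w)))

; ISBN-13 "9781593274917": the weights alternate 1 3 1 3 ...
; in-cycle is infinite, so the iteration stops with the digits.
(define sum-13
  (weighted-sum '(9 7 8 1 5 9 3 2 7 4 9 1 7)
                (in-cycle '(1 3))))

; ISBN-10 "026256114X": the weights run 10 9 ... 1, X stands for 10
(define sum-10
  (weighted-sum '(0 2 6 2 5 6 1 1 4 10)
                (in-range 10 0 -1)))

sum-13             ; 120
(modulo sum-13 10) ; 0, so the isbn-13 is valid
sum-10             ; 165
(modulo sum-10 11) ; 0, so the isbn-10 is valid
```

Both formats share the same shape (weighted sum, then a divisibility test), which is presumably why the module abstracts them into a single function parameterized by the weights and the modulus.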

Next article in the series: Racket: I/O, optional and keyword arguments