WARNING: This series is deprecated. The up-to-date version can be found here
TOC of the series
If you have read the book How to Design Programs (HtDP) you already know some parts of Racket, the language on which BSL, ISL, etc. are built. You may now want to learn more about Racket itself and find out how to translate your HtDP skills into full Racket.
You could read the book Realm of Racket, where Racket is introduced as needed in a friendly setting familiar to HtDP readers. You could even try The Racket Guide, a gentle and complete introduction to the language. But you may miss something in between: something less terse than The Guide, which is written by (professional) programmers for (professional) programmers, but more systematic than the Realm.
That's the intent of this series of posts. Its purpose is to show, by means of an example, how to start the transition from the HtDP languages (in particular, ISL+) to Racket or, at least, which new things to consider when working directly in Racket.
A single example cannot be complete, though. Racket is a very rich language, and sooner rather than later you'll need to dig into The Guide and The Racket Reference. So take these posts as a first step into Racket for those who have already mastered HtDP.
A public repository for the code presented in this series is available at FromHtDPtoRacket (git repo)
The sample problem we are going to tackle reads as follows:
Create a program to obtain bibliographic information about pdf books. It is assumed that those pdfs are readable (meaning that they are not made of photos of pages) and, for simplicity, unencrypted.
This task can be naturally designed as a composition of the following sub-tasks:
- Read the text contained in the pdf pages.
- Search for the ISBN on those pages.
- Retrieve from a remote provider information about the book with that ISBN.
- Parse the response to represent it as a Racket data type.
The second sub-task will be implemented entirely in ISL+, with the aid of a few Racket batteries. From the ISL+ implementation a Racket version will be derived. How to go from the ISL+ version to the Racket version is the main subject of these posts.
For completeness, an initial solution to the rest of the steps will be given directly in Racket.
Regarding the exposition, I don't want to describe at length every Racket construct presented. The aim is not to explain Racket. Instead, I'll introduce those constructs when needed, and point to the places in the Racket documentation that explain them.
Let's get started with the second step, the extraction of the ISBN, if any, from an input string.
This involves, of course, domain knowledge. See https://www.isbn-international.org/content/isbn-users-manual for more information (specifically, Section 5).
In summary, an ISBN can be of two types (formats): ISBN-10 and ISBN-13. The number itself is a string of 10 or 13 digits (from '0' to '9'), respectively. ISBN-10 also accepts 'X' as its last digit. An ISBN-10 number consists of four groups of digits: three of variable length (the first group can have up to 5 digits, the second up to 7, and the third up to 6) and a last group that is a single digit. An ISBN-13, in turn, has one more group, a 3-digit prefix that is currently 978 or 979. Groups may appear separated by hyphens or spaces, so we could see forms like the following:
- 0262062186
- 0-262-06218-6
- 0 262 06218 6
- 9780201896831
- 978-0-201-89683-1
- 978 0 201 89683 1
- 978-0201896831
In printed books, that number is typically preceded by an identifier that can appear under different guises:
- ISBN
- ISBN:
- ISBN-10
- ISBN-13
- ISBN-10:
- ISBN-13:
followed by a space before the number itself.
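As a rough illustration of how these forms can be captured with a regular expression, here is a minimal sketch in full Racket. The names `isbn-candidate-px`, `isbn-candidates` and `isbn-normalize` are made up for this example and are not the ones used in the series. The pattern is deliberately loose: it does not enforce the group lengths, it may match inside a longer run of digits, and it says nothing about validity.

```racket
#lang racket/base

;; Hypothetical pattern, for illustration only.
;; Optional label ("ISBN", "ISBN-10", "ISBN-13", with an optional colon)
;; followed by a space, then the number itself (captured as group 1):
;; an optional 978/979 prefix, nine digits, and a final digit or X.
;; A single hyphen or space is allowed after each digit except the last.
(define isbn-candidate-px
  #px"(?:ISBN(?:-1[03])?:? )?((?:97[89][- ]?)?(?:[0-9][- ]?){9}[0-9X])")

;; string -> (listof string)
;; All sub-strings of str that look like an ISBN, without the label.
(define (isbn-candidates str)
  (regexp-match* isbn-candidate-px str #:match-select cadr))

;; string -> string
;; Removes hyphens and spaces from a candidate.
(define (isbn-normalize candidate)
  (regexp-replace* #rx"[- ]" candidate ""))
```

For instance, `(isbn-candidates "ISBN-13: 978-0-201-89683-1 and ISBN 0 262 06218 6")` yields the two numbers as written in the text, and `isbn-normalize` turns each of them into a plain string of digits.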
Additionally, a string matching that format is an actual ISBN only if it passes a validity check, which involves an arithmetic operation. See the ISBN User Manual cited above for a description of the algorithm.
Although it is possible to devise a pure ISL+ implementation, it is better to resort to Racket's regular expression support, if only because the resulting Racket version will share the very same design with the ISL-with-regexes version. Otherwise, the ISL code would be quite different, and the translation into Racket not so seamless.
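To make the check concrete, here is a minimal sketch in full Racket of the usual check-digit rules: for ISBN-10 the digits are weighted 10 down to 1 and the sum must be divisible by 11 (with 'X' standing for 10), and for ISBN-13 the weights alternate between 1 and 3 and the sum must be divisible by 10. The function names are hypothetical and chosen only for this example; the inputs are assumed to be already normalized (no hyphens or spaces).

```racket
#lang racket/base

;; char -> natural
;; The numeric value of an ISBN character; X counts as 10.
(define (isbn-char->value c)
  (if (char=? c #\X)
      10
      (- (char->integer c) (char->integer #\0))))

;; string (natural -> natural) -> natural
;; Sum of the character values of s, each multiplied by the weight
;; that weight-of assigns to its 0-based position.
(define (weighted-sum s weight-of)
  (for/sum ([c (in-string s)]
            [i (in-naturals)])
    (* (isbn-char->value c) (weight-of i))))

;; string -> boolean
(define (valid-isbn-10? s)
  (and (regexp-match? #px"^[0-9]{9}[0-9X]$" s)
       (zero? (modulo (weighted-sum s (lambda (i) (- 10 i))) 11))))

;; string -> boolean
(define (valid-isbn-13? s)
  (and (regexp-match? #px"^[0-9]{13}$" s)
       (zero? (modulo (weighted-sum s (lambda (i) (if (even? i) 1 3))) 10))))
```

With these definitions, `(valid-isbn-10? "0262062186")` and `(valid-isbn-13? "9780201896831")` both hold, while changing the last digit of either string makes the check fail. Passing the weighting rule as a function is just one way to share the summation between the two formats.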
Regular expression support is in the core of Racket, so we need to require racket/base to use it. A warning, though: don't require racket/base in your ISL projects. If you really need it, just move to full Racket.
We will also require a couple of libraries to apply a few functions: racket/file for some test cases (you could use a function from 2htdp/batch-io instead), and, for brevity, three string functions that you could design on your own by following the HtDP recipe.
The first section of the code is, as HtDP mandates, a description of the data types we will use to represent the information at hand.
Next, patterns and regular expressions encoding the ISBN forms described above are created.
The next part contains predicates concerning all the data types of the first section. In fact, those predicates should be enough for any programmer as a precise description of the data types involved.
The rest proposes an implementation. It begins with functions to validate isbn-strings according to the required mathematical algorithm. (Note, by the way, that isbn-checksumf is an abstract function obtained by following the design recipe explained in HtDP/2e, Part III.)
The main functions are isbn-find, which produce a list of all ISBNs found in a string, or the first ISBN of a given isbn-format found in a string, respectively. The basic strategy is to search all the lines of the input for sub-strings that look like an ISBN, and to select from those candidates the ones that, once normalized, are actual ISBNs.
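The strategy just described (find candidates line by line, normalize them, keep the valid ones) can be sketched in full Racket as follows. This is only an illustration under my own assumptions: the names `find-all-isbns` and `find-first-isbn`, the loose candidate pattern, and the use of the string length (10 or 13) to stand for the format are all hypothetical and do not reproduce the code of the series.

```racket
#lang racket/base
(require racket/string) ; string-split

;; Loose candidate pattern (an assumption of this sketch): optional
;; 978/979 prefix, nine digits, final digit or X, with optional
;; single hyphens or spaces in between.
(define candidate-px
  #px"(?:97[89][- ]?)?(?:[0-9][- ]?){9}[0-9X]")

;; string -> string : drops hyphens and spaces
(define (normalize s)
  (regexp-replace* #rx"[- ]" s ""))

;; string (listof natural) natural -> boolean
;; Is the weighted sum of the characters of s divisible by modulus?
(define (checksum-ok? s weights modulus)
  (zero?
   (modulo (for/sum ([c (in-string s)]
                     [w (in-list weights)])
             (* w (if (char=? c #\X)
                      10
                      (- (char->integer c) (char->integer #\0)))))
           modulus)))

;; string -> boolean : s is a normalized candidate
(define (valid-isbn? s)
  (case (string-length s)
    [(10) (checksum-ok? s '(10 9 8 7 6 5 4 3 2 1) 11)]
    [(13) (and (not (regexp-match? #rx"X" s))
               (checksum-ok? s '(1 3 1 3 1 3 1 3 1 3 1 3 1) 10))]
    [else #f]))

;; string -> (listof string)
;; All valid ISBNs in text, normalized, in order of appearance.
(define (find-all-isbns text)
  (for*/list ([line (in-list (string-split text "\n"))]
              [candidate (in-list (regexp-match* candidate-px line))]
              [isbn (in-value (normalize candidate))]
              #:when (valid-isbn? isbn))
    isbn))

;; string natural -> (or/c string #f)
;; The first valid ISBN with len digits (10 or 13), or #f if none.
(define (find-first-isbn text len)
  (for/first ([isbn (in-list (find-all-isbns text))]
              #:when (= (string-length isbn) len))
    isbn))
```

Given a text with the lines `ISBN-13: 978-0-201-89683-1`, `ISBN 0-262-06218-6` and `bad: 0-262-06218-7`, `find-all-isbns` keeps the first two numbers (normalized) and rejects the third, whose check digit is wrong.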
This is the code:
Next article in the series: Racket: #lang, local definitions