Assignment 2: Designing methods for complex data
Goals: Learn to design methods for complex class hierarchies. Practice designing the representation of complex data.
Instructions
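Be very careful with naming. Whenever the assignment specifies:
the names of classes,
the names and types of the fields within classes,
the names, types and order of the arguments to the constructor,
the names, types and order of arguments to methods, or
filenames,
your submission must use exactly those names, because our test program relies on them.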
You will submit this assignment by the deadlines using the course handin server. Follow A Complete Guide to the Handin Server for information on how to use the handin server. You may submit as many times as you wish. Be aware that the server may slow down close to the deadline as it handles many submissions, so try to finish early. There will be a separate submission for each problem; this makes it easier to grade each problem and to give you feedback on each one.
Remember that you should avoid accessing fields of fields and using any type-checkers. Design your methods systematically using the Design Recipe as we have been doing in class!
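As a reminder of what "avoid accessing fields of fields" looks like in practice, here is a minimal sketch. The Book and Author classes and the methods hasName and writtenBy are hypothetical, chosen only for illustration; they are not part of this assignment.

```java
// Hypothetical classes for illustration only; not part of this assignment.

// to represent an author
class Author {
  String name;

  Author(String name) {
    this.name = name;
  }

  // is this author's name the same as the given name?
  boolean hasName(String given) {
    return this.name.equals(given);
  }
}

// to represent a book
class Book {
  String title;
  Author author;

  Book(String title, Author author) {
    this.title = title;
    this.author = author;
  }

  // was this book written by an author with the given name?
  // Delegates to the Author instead of reading this.author.name directly.
  boolean writtenBy(String name) {
    return this.author.hasName(name);
  }
}
```

The Book never looks inside its Author; it asks the Author a question and lets that class answer using its own fields.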
As always, you may only use techniques that have been covered in lectures so far in your solutions.
Part 1: Monday, January 20th, 9:00 pm (with self-eval due Tuesday January 21st by 10pm)
Part 2: Thursday, January 23rd, 9:00 pm
Practice Problems
Work out these problems on your own. Save them in an electronic portfolio, so you can show them to your instructor, review them before the exam, and use them as a reference when working on the homework assignments.
Problem 10.6 on page 102
Problem 11.2 on page 113
Problem 12.1 on page 125
Problem 14.7 on page 140
Problem 15.2 on page 149
Problem 15.3 on page 149
Problem 15.8 on page 171
Problem 1: “Webpages”
Part 1: Submit your data definitions (templates included), examples and tests for the totalImageSize, textLength and images methods. You should include stubs for the methods themselves, not complete implementations. This will be partly graded on the completeness of your test cases. You may add examples if there are cases you want to test that are not covered by the examples described below. Please add a comment for each test case describing what you are testing. You will also submit a self-eval by Tuesday, January 21st at 10pm. The self-eval will ask you questions about the tests you wrote.
Part 2: Submit everything from Part 1 but this time include complete implementations for the methods (including any helpers needed).
The following DrRacket data definition describes the contents of a webpage:
;; A WebPage is (make-web-page String String [Listof Item])
(define-struct web-page (title url items))

;; An Item is one of
;; -- Text
;; -- Image
;; -- Link

;; A Text is (make-text String)
(define-struct text (contents))

;; An Image is (make-image String int String)
(define-struct image (file-name size file-type))

;; A Link is (make-link String WebPage)
(define-struct link (name page))
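To show how a union such as Item might map onto Java, here is a partial sketch covering only the Text and Image variants. The interface name IItem and the camelCase field names fileName and fileType are assumptions made for this sketch; if the assignment specifies names, use those instead. Link and WebPage are deliberately left out.

```java
// Assumed name: the data definition calls this union "Item";
// IItem follows the usual I-prefix convention for interfaces.
interface IItem {
}

// to represent a piece of text on a web page
class Text implements IItem {
  String contents;

  Text(String contents) {
    this.contents = contents;
  }
}

// to represent an image on a web page
class Image implements IItem {
  String fileName; // assumed Java spelling of file-name
  int size;
  String fileType; // assumed Java spelling of file-type

  Image(String fileName, int size, String fileType) {
    this.fileName = fileName;
    this.size = size;
    this.fileType = fileType;
  }
}
```

Each variant of the union becomes a class that implements the shared interface, so any method later declared in the interface must be implemented by every variant.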
We are giving you the names of the classes or interfaces you will probably need.
A reminder on naming conventions: For lists of data, the names of the interface should always start with ILo, while the two classes’ names start with MtLo for the empty lists and ConsLo for the nonempty lists; all three of these names should be followed by the name of the datatype of the elements of the list. So we would have ILoString, MtLoString, ConsLoString to represent lists of Strings, ILoBook, MtLoBook, ConsLoBook to represent lists of Books, etc.
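The naming convention above can be sketched for lists of Strings as follows. The three type names come from the convention itself; the method totalLength is a hypothetical example added only to show how each class implements the interface without any type-checking.

```java
// to represent a list of Strings
interface ILoString {
  // compute the total number of characters in all Strings in this list
  // (hypothetical method, used here only for illustration)
  int totalLength();
}

// to represent an empty list of Strings
class MtLoString implements ILoString {
  MtLoString() {
  }

  // an empty list contains no characters
  public int totalLength() {
    return 0;
  }
}

// to represent a nonempty list of Strings
class ConsLoString implements ILoString {
  String first;
  ILoString rest;

  ConsLoString(String first, ILoString rest) {
    this.first = first;
    this.rest = rest;
  }

  // add the length of the first String to the total for the rest
  public int totalLength() {
    return this.first.length() + this.rest.totalLength();
  }
}
```

For example, new ConsLoString("ab", new ConsLoString("cde", new MtLoString())).totalLength() evaluates to 5.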
Interpretive Reflection
Web pages have a natural structure that allows programs, often called bots (or robots), to go from page to page by following links. An old standard file, robots.txt, which indicates which pages or parts of websites certain bots should not inspect, has gained new popularity recently. Why might that be?
Since 2023, increasing numbers of sites and companies have marked their sites off limits to OpenAI’s GPTBot in particular, as they believe that, unlike the “crawling” done by, e.g., Google’s search engine bot in the past, what OpenAI is doing to build its models (to power ChatGPT) constitutes actual theft of information.
Draw a class diagram for the classes that represent this data definition.
Define Java classes that represent web pages as defined above.
Describe (in English, or in a diagram, or in code...) the contents of a webpage that has at least one text, two images and three links to pages.
Design the data representation of the example you just described.
Name your examples class ExamplesWebPage
In the ExamplesWebPage class, design examples of the following web pages:
WebPage: with the title "Fundies II" at url "ccs.neu.edu/Fundies2"
  that contains the following items:
  -- text "Home sweet home"
  -- image of the lab in the file "wvh-lab" of size 400
     and file type "png"
  -- text "The staff"
  -- image of the professors in the file "profs" of size 240
     and file type "jpeg"
  -- link with the label "A Look Back" to the web page named "HtDP"
  -- link with the label "A Look Ahead" to the web page named "OOD"

WebPage: with the title "HtDP" at url "htdp.org"
  that contains the following items:
  -- text "How to Design Programs"
  -- image of the book cover in the file "htdp" of size 4300
     and file type "tiff"

WebPage: with the title "OOD" at url "ccs.neu.edu/OOD"
  that contains the following items:
  -- text "Stay classy, Java"
  -- link with the label "Back to the Future" to the web page named "HtDP"
Name the "Fundies II" example fundiesWP. Our test program will check that the field fundiesWP in the class ExamplesWebPage represents this information. (You may name the other two web pages, and all the items inside them, anything you like, though the names should be reasonably descriptive.)
Design the method totalImageSize that computes the total size of all images in the fundiesWP web page and all web pages that are linked to it.
Design the method textLength that computes the number of letters in all text that appears on the web site starting at this web page. This includes the contents of the text, the names of the image files plus the file type (but not the typical dot that is used in the full file name), the labels for links, and the titles of the web pages.
Tricky! Design the method images that produces one String that has in it all names of images on this web page and all web pages linked to it, given with their file types, and separated by comma and space.
So for the example above this String would be
"wvh-lab.png, profs.jpeg, htdp.tiff, htdp.tiff"
Note: You can combine two Strings with the + operator, or by invoking the method concat (e.g. s1.concat(s2) produces a new String by appending the String s2 to the String s1).
Note: There is a comma and space between any two entries, but not after the last one.
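One way to get a separator between entries but not after the last one is sketched below on a plain list of Strings. This is only an illustration of the idea under stated assumptions, not the intended solution: the method names joined and joinedAfterFirst are hypothetical, and the list holds ready-made Strings rather than web page items.

```java
// to represent a list of Strings
interface ILoString {
  // produce all Strings in this list, separated by a comma and a space
  String joined();

  // produce all Strings in this list, each one preceded by a comma and a space
  String joinedAfterFirst();
}

// to represent an empty list of Strings
class MtLoString implements ILoString {
  MtLoString() {
  }

  public String joined() {
    return "";
  }

  public String joinedAfterFirst() {
    return "";
  }
}

// to represent a nonempty list of Strings
class ConsLoString implements ILoString {
  String first;
  ILoString rest;

  ConsLoString(String first, ILoString rest) {
    this.first = first;
    this.rest = rest;
  }

  // the first entry gets no separator; every later entry brings its own
  public String joined() {
    return this.first + this.rest.joinedAfterFirst();
  }

  // same result as using +, written with concat to show both styles
  public String joinedAfterFirst() {
    return ", ".concat(this.first).concat(this.rest.joinedAfterFirst());
  }
}
```

Because each separator is placed in front of an entry rather than behind it, the last entry is never followed by a comma, and an empty list produces the empty String.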
We will eventually see how to prevent this duplication from happening, but not for quite a while!
In a comment in your file, explain why htdp.tiff appears twice in the results of the images method. Also tell us if there are any other places in your code where this duplication occurs. (Hint: look very carefully at the output from the tester library when it shows your data.)
Submit your work in a file called WebPage.java