Introduction to Computer Science II
Final Exam Practice 2
Solve the problems below by
implementing the appropriate functions and classes in the file FinalPractice2.py.
Problems
1. Implement a recursive
function posProd() that takes a list of integers as input
and returns the product of all the positive numbers in the
list; non-positive numbers are ignored. If no number is positive,
1 should be returned. Your implementation must be recursive,
without any loops.
Usage:
>>> posProd([])
1
>>> posProd([0, -1, -4])
1
>>> posProd([0, -1, 4])
4
>>> posProd([0, 2, 4])
8
>>> posProd([0, 2, -2, 4, 3])
24
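One possible way to structure the recursion is sketched below. It is only an illustration, not the official solution: the empty list is the base case (the empty product is 1), and each recursive call handles the first item and delegates the rest of the list.

```python
def posProd(lst):
    """Return the product of the positive integers in lst (1 if there are none)."""
    # Base case: an empty list contributes the empty product, 1.
    if not lst:
        return 1
    # Recursive case: compute the product for the rest of the list,
    # then multiply in the first item only if it is positive.
    rest = posProd(lst[1:])
    if lst[0] > 0:
        return lst[0] * rest
    return rest
```

Slicing with lst[1:] copies the list at every call, which is fine for short inputs; passing an index instead would avoid the copies.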
2. Implement a recursive function dirSize()
that takes a pathname (as a string) as input. If the pathname refers
to a regular file, the function should return the size of
the file; if the pathname is a folder, the function should return
the sum of the sizes of all regular files contained in the
directory, whether directly in the folder or indirectly through
subfolders.
To get the size of a regular file you can use function getsize()
from module os.path that takes a file pathname as input
and returns its size. You will also need functions isfile()
and join() from module os.path and function listdir()
from module os (all of which we have used multiple times
in the course). Your implementation must be recursive
and must not use any Python Standard Library functions other than those listed above.
Test your function on 1) your file FinalPractice2.py that you are
working on and 2) the folder test obtained by downloading
the file test.zip and unzipping it in the
same folder as your file FinalPractice2.py.
Usage:
>>> dirSize('FinalPractice2.py')
317    # Note: this number will be different for you
>>> dirSize('test')
102
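A minimal sketch of the recursive traversal described in this problem, using only getsize(), isfile(), join() and listdir(). It assumes that any pathname that is not a regular file is a folder that can be listed; that assumption is not stated in the problem.

```python
import os
from os.path import getsize, isfile, join

def dirSize(pathname):
    """Return the total size of all regular files at or below pathname."""
    # Base case: a regular file contributes its own size.
    if isfile(pathname):
        return getsize(pathname)
    # Recursive case (assumption: pathname is a folder): add up the
    # sizes of everything the folder contains, directly or indirectly.
    total = 0
    for item in os.listdir(pathname):
        total += dirSize(join(pathname, item))
    return total
```

The loop only iterates over the items of one folder; descending into subfolders is done by the recursive call.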
3. Develop a class ElementCounter
as a subclass of HTMLParser whose constructor takes an HTML element tag
as input and which, when fed an HTML document, counts the number of such
elements in the document. The usage below shows that the file lists.html contains 3
ordered lists (with tag 'ol') and 2 unordered lists (with tag 'ul').
The class ElementCounter should support a method elementCount()
that takes no input arguments and returns the number of elements
with the given tag. Test your solution on lists.html.
Usage:
>>> infile = open('lists.html')
>>> content = infile.read()
>>> infile.close()
>>> p = ElementCounter('ol')
>>> p.feed(content)
>>> p.elementCount()
3
>>> p = ElementCounter('ul')
>>> p.feed(content)
>>> p.elementCount()
2
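The class described above could be sketched as follows. This is one possible design, not the official solution; the attribute names tag and count are illustrative choices. It counts start tags only, so each element is counted once when it opens.

```python
from html.parser import HTMLParser

class ElementCounter(HTMLParser):
    """Count the elements with a given tag in the HTML fed to the parser."""

    def __init__(self, tag):
        HTMLParser.__init__(self)
        # HTMLParser reports tag names in lowercase, so normalize the input.
        self.tag = tag.lower()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # Called by feed() once for every start tag in the document.
        if tag == self.tag:
            self.count += 1

    def elementCount(self):
        return self.count
```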
4. In each of the three exercises
in function test,
construct a regular expression that matches the
words described below in the file frankenstein.txt.
Usage:
>>> test('frankenstein.txt')
Exercise i: words that start with string 'inter' or 'Inter'
['-interest', ' interview', ' intertwined', ' interchanging',
'\ninterest', ' intercourse', '\ninteresting', ' internal',
'\nintervening', ' interfere', ' interest', '\ninterment',
'\ninterchange', ' interests', ' interruption',
'\ninterpretation', ' interference', ' intercept',
' interfered', ' interpreted', ' interspersed', ' interested',
' interpretation', ' interpreter', ' intercepted', '\nintervened',
' Interpret', ' interval', ' interesting', ' interrupt',
' interrupted', ' intermixed', ' intersected', ' interchange',
' intervals', '\ninterview', ' interment']
Exercise ii: words that start with an upper case and end with
letters 'ar'
[' Hear ', ' Dear ', ' Caesar ', ' Vicar ']
Exercise iii: words that contain the string 'death' as a
substring and end with 'e'
[]
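To illustrate how such patterns can be built, here is a small sketch that runs re.findall() on a made-up sample string instead of frankenstein.txt. The patterns are assumptions, not the ones used by function test: they rely on the word boundary \b, so the matches do not include the surrounding blanks, hyphens or newlines that appear in the usage above.

```python
import re

# Made-up sample text; the exercises themselves use frankenstein.txt.
sample = ("An Interview about interest rates. "
          "Dear Caesar, the deathlike scene near the Vicar.")

# i: words that start with 'inter' or 'Inter'
inter_words = re.findall(r'\b[iI]nter[a-z]*', sample)

# ii: words that start with an uppercase letter and end with 'ar'
ar_words = re.findall(r'\b[A-Z][a-z]*ar\b', sample)

# iii: words that contain 'death' and end with 'e'
death_words = re.findall(r'\b[a-zA-Z]*death[a-zA-Z]*e\b', sample)

print(inter_words)   # ['Interview', 'interest']
print(ar_words)      # ['Dear', 'Caesar', 'Vicar']
print(death_words)   # ['deathlike']
```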
5. Write a function search() that takes a string url as input
and then crawls through all the web pages that can be reached from the
web page with URL url. No web page should
be visited more than once. Your crawler will count the number of
hyperlinks in each visited web page and store the result in a global
dictionary variable d. At the end of the crawl, dictionary
d will contain (key: value) pairs where the keys
are URLs found through the crawl and the value for each key (URL) is
the number of hyperlinks in the web page associated with the URL.
Usage:
>>> search('http://reed.cs.depaul.edu/lperkovic/test1.html')
>>> d
{'http://reed.cs.depaul.edu/lperkovic/test1.html': 2,
'http://reed.cs.depaul.edu/lperkovic/test2.html': 1,
'http://reed.cs.depaul.edu/lperkovic/test4.html': 0,
'http://reed.cs.depaul.edu/lperkovic/test3.html': 1}
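The crawl described in this problem could be sketched as follows. This is one possible approach under several assumptions that the problem does not state: the helper class LinkCollector is hypothetical, every anchor with an href attribute counts as one hyperlink (duplicates included), relative links are resolved with urljoin(), and pages that cannot be opened are skipped.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

d = {}   # global dictionary: URL -> number of hyperlinks in that page

class LinkCollector(HTMLParser):
    """Hypothetical helper: collect the absolute URLs of all hyperlinks."""

    def __init__(self, base_url):
        HTMLParser.__init__(self)
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value is not None:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))

def search(url):
    """Recursively crawl from url, recording link counts in d."""
    if url in d:                      # visit each page at most once
        return
    try:
        with urlopen(url) as response:
            content = response.read().decode(errors='replace')
    except (OSError, ValueError):     # assumption: skip unreachable pages
        return
    collector = LinkCollector(url)
    collector.feed(content)
    d[url] = len(collector.links)     # record before recursing, so cycles stop
    for link in collector.links:
        search(link)
```

Recording d[url] before following the links is what keeps the crawler from looping forever when pages link back to each other.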