Academic Year: Fall/Winter 2013/2014

Term: 1

Day/Evening: D

Instructor: Prof. Andra Willits

Email: willits@mcmaster.ca

Office: Togo Salmon Hall 602

Phone: 905-525-9140 x 23761


Office Hours: Thursday 9:30-10:30, or by appointment

Course Objectives:

By the end of the course, students should be able to:

  • Write simple Python programs to manipulate and analyze language data
  • Use existing tools to explore large text corpora
  • Understand and demonstrate how to use data structures and algorithms in Natural Language Processing applications
  • Store data and use it to evaluate the performance of NLP techniques

Textbooks, Materials & Fees:

Steven Bird, Ewan Klein, and Edward Loper, Natural Language Processing with Python. Cambridge: O'Reilly, 2009.

Method of Assessment:

Assignments (3) 45%

[Due: October 6, October 27, November 27]

    There will be 3 programming assignments spaced evenly throughout the term.  Students are expected to complete each one by its indicated due date and submit it electronically to the instructor.

Midterm Examination 15%

[Thursday November 7]

    There will be a 50-minute midterm held during class.

Final Examination 40%

    The final examination will be held during the examination period at the end of the semester.


Policy on Missed Work, Extensions, and Late Penalties:

Written Work and Late Submissions:

Assignments must be submitted on the due date.  Late assignments will be penalized 5% per day including weekends.  Late penalties will not be waived unless you provide your Faculty/Program office with the appropriate documentation to support your inability to submit the work by the due date.


McMaster Student Absence Form (MSAF)

This is a self-reporting tool for undergraduate students to report absences DUE TO MINOR MEDICAL SITUATIONS that last up to 5 days; it also provides the ability to request accommodation for any missed academic work. Please note, this tool cannot be used during any final examination period. You may submit a maximum of 1 Academic Work Missed request per term. It is YOUR responsibility to follow up with your Instructor immediately (NORMALLY WITHIN TWO WORKING DAYS) regarding the nature of the accommodation. If you are absent for reasons other than medical reasons, for more than 5 days, or exceed 1 request per term, you MUST visit your Associate Dean's Office (Faculty Office). You may be required to provide supporting documentation. This form should be filled out immediately when you are about to return to class after your absence.

Please Note the Following Policies and Statements:

Academic Dishonesty

You are expected to exhibit honesty and use ethical behaviour in all aspects of the learning process. Academic credentials you earn are rooted in principles of honesty and academic integrity.

Academic dishonesty is to knowingly act or fail to act in a way that results or could result in unearned academic credit or advantage. This behaviour can result in serious consequences, e.g. the grade of zero on an assignment, loss of credit with a notation on the transcript (notation reads: "Grade of F assigned for academic dishonesty"), and/or suspension or expulsion from the university.

It is your responsibility to understand what constitutes academic dishonesty. For information on the various types of academic dishonesty please refer to the Academic Integrity Policy, located at www.mcmaster.ca/academicintegrity

The following illustrates only three forms of academic dishonesty:

  1. Plagiarism, e.g. the submission of work that is not one’s own or for which other credit has been obtained.
  2. Improper collaboration in group work.
  3. Copying or using unauthorized aids in tests and examinations.

Email correspondence policy

It is the policy of the Faculty of Humanities that all email communication sent from students to instructors (including TAs), and from students to staff, must originate from each student’s own McMaster University email account. This policy protects confidentiality and confirms the identity of the student.  Instructors will delete emails that do not originate from a McMaster email account.

Modification of course outlines

The University reserves the right to change dates and/or deadlines etc. for any or all courses in the case of an emergency situation or labour disruption or civil unrest/disobedience, etc. If a modification becomes necessary, reasonable notice and communication with the students will be given with an explanation and the opportunity to comment on changes. Any significant changes should be made in consultation with the Department Chair.

Academic Accommodation of Students with Disabilities

Students who require academic accommodation must contact Student Accessibility Services (SAS) to make arrangements with a Program Coordinator. Academic accommodations must be arranged for each term of study. Student Accessibility Services can be contacted by phone 905-525-9140 ext. 28652 or e-mail sas@mcmaster.ca. For further information, consult McMaster University's Policy for Academic Accommodation of Students with Disabilities.

Academic Accommodation for Religious, Indigenous and Spiritual Observances

Students requiring academic accommodation based on religion and spiritual observances should follow the procedures set out in the Course Calendar or by their respective Faculty. In most cases, the student should contact his or her professor or academic advisor as soon as possible to arrange accommodations for classes, assignments, tests and examinations that might be affected by a religious holiday or spiritual observance.

Topics and Readings:

Schedule of Readings and Lectures

Note: At certain points in the course it may make sense to modify the schedule outlined below.  The instructor reserves the right to modify elements of the course and will notify students accordingly.

Week 1

Thursday September 5    Introduction to course

Week 2

Monday September 9      Language Processing using Python

Thursday September 12     Computing Statistics; Automatic NLP: The big picture; Challenges of NLP

                Reading: Chapter 1

Week 3

Monday September 16     Text Corpora

Thursday September 19     Lexical Resources; WordNet

                Reading: Chapter 2

Week 4

Monday September 23     Processing Raw Text

Thursday September 26     Regular Expressions and Normalizing Text

                Reading: Chapter 3

Week 5

Monday September 30     Writing Structured Programs: Python concepts, Object-Oriented Programming

Thursday October 3         Algorithm Design & Debugging

                Reading: Chapter 4


                Assignment 1 (Chapters 1, 2, 3)

                Due: Sunday October 6


Week 6

Monday October 7         Part of Speech Tagging, Corpora and Python Dictionaries

Thursday October 10         Automatic Tagging

                Reading: Chapter 5

Week 7

Monday October 14        Thanksgiving. No class.

Thursday October 17         Supervised Classification of Text; Evaluation

                Reading: Chapter 6

Week 8

Monday October 21         Machine Learning Techniques

Thursday October 24         Extracting Information from Text: Chunking

                Reading: Chapter 7


                Assignment 2 (Chapters 4, 5, 6)

                Due: Sunday October 27


Week 9

Monday October 28         Named Entity Recognition; Midterm Review

Thursday October 31        Midterm Recess. No class.

                Reading: No reading this week

Week 10

Monday November 4         Midterm Review, Catch-up

Thursday November 7     Midterm

                Reading: No reading this week

Week 11

Monday November 11     Analyzing Sentence Structure: Context Free Grammar

Thursday November 14    Analyzing Sentence Structure: Dependency Grammar; Grammar Development

                Reading: Chapter 8

Week 12

Monday November 18     Building Feature Based Grammars

Thursday November 21     Building Feature Based Grammars

                Reading: Chapter 9


                Assignment 3 (Chapters 7, 8, 9)

                Due: Wednesday November 27


Week 13

Monday November 25     Analyzing the Meaning of Sentences: Logic

Thursday November 28     Semantics of English Sentences

                Reading: Chapter 10

Week 14

Monday December 2        Summary and Review


Other Course Information:

Course Format:

  • The class will meet twice a week.  On Mondays, our class will run for 2 hours.  The first hour will introduce new Python and NLP concepts.  In the second hour, we will work through exercises on the computer so that students can apply their Python knowledge in a hands-on manner, with the instructor available to assist with any programming difficulties.  On Thursdays, the hour will be spent exploring additional concepts through lecture slides.
  • There will be 3 programming assignments, a written midterm, and a final examination to test the skills obtained during the course.