Academic Year: Fall/Winter 2013/2014

Term: 1

Day/Evening: D

Instructor: Prof. Andra Willits


Office: Togo Salmon Hall 602

Phone: 905-525-9140 x 23761

Office Hours: Thursday 9:30-10:30, or by appointment

Course Objectives:

By the end of the course, students should be able to:

  • Write simple Python programs to manipulate and analyze language data
  • Use existing tools to explore large text corpora
  • Understand and demonstrate how to use data structures and algorithms in Natural Language Processing applications
  • Store data and use it to evaluate the performance of NLP techniques

Textbooks, Materials & Fees:

Steven Bird, Ewan Klein, and Edward Loper, Natural Language Processing with Python. Cambridge: O'Reilly, 2009.

Method of Assessment:

Assignments (3) 45%

[Due: October 6, October 27, November 27]

    There will be 3 programming assignments spaced evenly throughout the term. Students are expected to complete them by their indicated due date and submit them electronically to the instructor.

Midterm Examination 15%

[Thursday November 7]

    There will be a 50-minute midterm held during class.

Final Examination 40%

    The final examination will be held during the examination period at the end of the semester.
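
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes the three assignments count equally toward the 45% (the outline does not say so), and the function name and sample marks are hypothetical.

```python
# Weights taken from the outline: assignments 45%, midterm 15%, final 40%.
# Assumption (not stated in the outline): the three assignments count equally.
WEIGHTS = {"assignments": 0.45, "midterm": 0.15, "final": 0.40}


def course_grade(assignment_marks, midterm, final):
    """Return the weighted course grade, with all marks given out of 100."""
    assignment_avg = sum(assignment_marks) / len(assignment_marks)
    return (
        WEIGHTS["assignments"] * assignment_avg
        + WEIGHTS["midterm"] * midterm
        + WEIGHTS["final"] * final
    )


if __name__ == "__main__":
    # Hypothetical marks, for illustration only.
    print(round(course_grade([80, 70, 90], midterm=60, final=75), 2))
```

This is only a sketch for estimating a standing in the course; the instructor's own calculation is authoritative.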


Policy on Missed Work, Extensions, and Late Penalties:

Written Work and Late Submissions:

Assignments must be submitted on the due date.  Late assignments will be penalized 5% per day including weekends.  Late penalties will not be waived unless you provide your Faculty/Program office with the appropriate documentation to support your inability to submit the work by the due date.
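
The late penalty can be worked through with a short Python sketch. It assumes the 5% is deducted as flat percentage points per calendar day and that a mark cannot fall below zero; neither detail is spelled out above, and the function name is hypothetical.

```python
from datetime import date

PENALTY_PER_DAY = 5  # percentage points per calendar day, weekends included


def penalized_mark(raw_mark, due, submitted):
    """Apply the late penalty to a mark out of 100.

    Assumptions (not stated in the outline): the penalty is deducted as flat
    percentage points per calendar day, and the mark never drops below zero.
    """
    days_late = max(0, (submitted - due).days)
    return max(0, raw_mark - PENALTY_PER_DAY * days_late)


if __name__ == "__main__":
    # Assignment 1 is due Sunday October 6; a hypothetical submission on
    # Tuesday October 8 is two calendar days late.
    print(penalized_mark(80, date(2013, 10, 6), date(2013, 10, 8)))
```

Under these assumptions, an assignment marked 80 and submitted two days late would receive 70.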


McMaster Student Absence Form (MSAF)

This is a self-reporting tool for undergraduate students to report absences DUE TO MINOR MEDICAL SITUATIONS that last up to 5 days, and it provides the ability to request accommodation for any missed academic work. Please note, this tool cannot be used during any final examination period. You may submit a maximum of 1 Academic Work Missed request per term. It is YOUR responsibility to follow up with your Instructor immediately (NORMALLY WITHIN TWO WORKING DAYS) regarding the nature of the accommodation. If you are absent for reasons other than medical reasons, for more than 5 days, or exceed 1 request per term, you MUST visit your Associate Dean's Office/Faculty Office. You may be required to provide supporting documentation. This form should be filled out immediately when you are about to return to class after your absence.

Please Note the Following Policies and Statements:

Academic Integrity

You are expected to exhibit honesty and use ethical behaviour in all aspects of the learning process. Academic credentials you earn are rooted in principles of honesty and academic integrity. It is your responsibility to understand what constitutes academic dishonesty.

Academic dishonesty is to knowingly act or fail to act in a way that results or could result in unearned academic credit or advantage. This behaviour can result in serious consequences, e.g. the grade of zero on an assignment, loss of credit with a notation on the transcript (notation reads: "Grade of F assigned for academic dishonesty"), and/or suspension or expulsion from the university. For information on the various types of academic dishonesty, please refer to the Academic Integrity Policy.

The following illustrates only three forms of academic dishonesty:

  • plagiarism, e.g. the submission of work that is not one’s own or for which other credit has been obtained.
  • improper collaboration in group work.
  • copying or using unauthorized aids in tests and examinations.

Authenticity / Plagiarism Detection

Some courses may use a web-based service to reveal authenticity and ownership of student-submitted work. For courses using such software, students will be expected to submit their work electronically, either directly to the service or via Avenue to Learn (A2L) plagiarism detection, so it can be checked for academic dishonesty.

Students who do not wish to submit their work through A2L and/or the service must still submit an electronic and/or hard copy to the instructor. No penalty will be assigned to a student who does not submit work to the service or A2L. All submitted work is subject to normal verification that standards of academic integrity have been upheld (e.g., on-line search, other software, etc.).

Courses with an On-Line Element

Some courses use on-line elements (e.g. e-mail, Avenue to Learn (A2L), LearnLink, web pages, capa, Moodle, ThinkingCap, etc.). Students should be aware that, when they access the electronic components of a course using these elements, private information such as first and last names, user names for the McMaster e-mail accounts, and program affiliation may become apparent to all other students in the same course. The available information is dependent on the technology used. Continuation in a course that uses on-line elements will be deemed consent to this disclosure. If you have any questions or concerns about such disclosure please discuss this with the course instructor.

Online Proctoring

Some courses may use online proctoring software for tests and exams. This software may require students to turn on their video camera, present identification, monitor and record their computer activities, and/or lockdown their browser during tests or exams. This software may be required to be installed before the exam begins.

Conduct Expectations

As a McMaster student, you have the right to experience, and the responsibility to demonstrate, respectful and dignified interactions within all of our living, learning and working communities. These expectations are described in the Code of Student Rights & Responsibilities (the "Code"). All students share the responsibility of maintaining a positive environment for the academic and personal growth of all McMaster community members, whether in person or online.

It is essential that students be mindful of their interactions online, as the Code remains in effect in virtual learning environments. The Code applies to any interactions that adversely affect, disrupt, or interfere with reasonable participation in University activities. Student disruptions or behaviours that interfere with university functions on online platforms (e.g. use of Avenue to Learn, WebEx or Zoom for delivery) will be taken very seriously and will be investigated. Outcomes may include restriction or removal of the involved students' access to these platforms.

Academic Accommodation of Students with Disabilities

Students with disabilities who require academic accommodation must contact Student Accessibility Services (SAS) at 905-525-9140 ext. 28652 or by e-mail to make arrangements with a Program Coordinator. For further information, consult McMaster University’s Academic Accommodation of Students with Disabilities policy.

Email correspondence policy

It is the policy of the Faculty of Humanities that all email communication sent from students to instructors (including TAs), and from students to staff, must originate from each student’s own McMaster University email account. This policy protects confidentiality and confirms the identity of the student.  Instructors will delete emails that do not originate from a McMaster email account.

Modification of course outlines

The University reserves the right to change dates and/or deadlines etc. for any or all courses in the case of an emergency situation or labour disruption or civil unrest/disobedience, etc. If a modification becomes necessary, reasonable notice and communication with the students will be given with an explanation and the opportunity to comment on changes. Any significant changes should be made in consultation with the Department Chair.

Request for Relief for Missed Academic Term Work
McMaster Student Absence Form (MSAF)

In the event of an absence for medical or other reasons, students should review and follow the Academic Regulation in the Undergraduate Calendar "Requests for Relief for Missed Academic Term Work".

Academic Accommodation for Religious, Indigenous and Spiritual Observances (RISO)

Students requiring academic accommodation based on religious, indigenous or spiritual observances should follow the procedures set out in the RISO policy. Students should submit their request to their Faculty Office normally within 10 working days of the beginning of term in which they anticipate a need for accommodation or to the Registrar's Office prior to their examinations. Students should also contact their instructors as soon as possible to make alternative arrangements for classes, assignments, and tests.

Copyright and Recording

Students are advised that lectures, demonstrations, performances, and any other course material provided by an instructor include copyright protected works. The Copyright Act and copyright law protect every original literary, dramatic, musical and artistic work, including lectures by University instructors.

The recording of lectures, tutorials, or other methods of instruction may occur during a course. Recording may be done by either the instructor for the purpose of authorized distribution, or by a student for the purpose of personal study. Students should be aware that their voice and/or image may be recorded by others during the class. Please speak with the instructor if this is a concern for you.

Extreme Circumstances

The University reserves the right to change the dates and deadlines for any or all courses in extreme circumstances (e.g., severe weather, labour disruptions, etc.). Changes will be communicated through regular McMaster communication channels, such as McMaster Daily News, A2L and/or McMaster email.

Topics and Readings:

Schedule of Readings and Lectures

Note: At certain points in the course it may make sense to modify the schedule outlined below.  The instructor reserves the right to modify elements of the course and will notify students accordingly.

Week 1

Thursday September 5    Introduction to course

Week 2

Monday September 9      Language Processing using Python

Thursday September 12     Computing Statistics; Automatic NLP: The big picture; Challenges of NLP

                Reading: Chapter 1

Week 3

Monday September 16     Text Corpora

Thursday September 19     Lexical Resources; WordNet

                Reading: Chapter 2

Week 4

Monday September 23     Processing Raw Text

Thursday September 26     Regular Expressions and Normalizing Text

                Reading: Chapter 3

Week 5

Monday September 30     Writing Structured Programs: Python concepts, Object-Oriented Programming

Thursday October 3         Algorithm Design & Debugging

                Reading: Chapter 4


                Assignment 1 (Chapters 1, 2, 3)

                Due: Sunday October 6


Week 6

Monday October 7         Part of Speech Tagging, Corpora and Python Dictionaries

Thursday October 10         Automatic Tagging

                Reading: Chapter 5

Week 7

Monday October 14        Thanksgiving. No class.

Thursday October 17         Supervised Classification of Text; Evaluation

                Reading: Chapter 6

Week 8

Monday October 21         Machine Learning Techniques

Thursday October 24         Extracting Information from Text: Chunking

                Reading: Chapter 7


                Assignment 2 (Chapters 4, 5, 6)

                Due: Sunday October 27


Week 9

Monday October 28         Named Entity Recognition; Midterm Review

Thursday October 31        Midterm Recess. No class.

                Reading: No reading this week

Week 10

Monday November 4         Midterm Review, Catchup 

Thursday November 7     Midterm

                Reading: No reading this week

Week 11

Monday November 11     Analyzing Sentence Structure: Context Free Grammar

Thursday November 14    Analyzing Sentence Structure: Dependency Grammar; Grammar Development

                Reading: Chapter 8

Week 12

Monday November 18     Building Feature Based Grammars

Thursday November 21     Building Feature Based Grammars

                Reading: Chapter 9


                Assignment 3 (Chapters 7, 8, 9)

                Due: Wednesday November 27


Week 13

Monday November 25     Analyzing the Meaning of Sentences: Logic

Thursday November 28     Semantics of English Sentences

                Reading: Chapter 10

Week 14

Monday December 2        Summary and Review


Other Course Information:

Course Format:

  • The class will meet twice a week. On Mondays, the class will run for 2 hours. The first hour will introduce new Python and NLP concepts. In the second hour, we will work through exercises on the computer so that students can apply their Python knowledge in a hands-on manner while the instructor assists with any programming difficulties. On Thursdays, the hour will be spent exploring additional concepts through lecture slides.
  • There will be 3 programming assignments, a written midterm, and a final examination to test the skills obtained during the course.