SIL-UND 2010 LING 507: Computational Syntax and Morphology Calendar
NB: The materials linked in the table below are stored in the LING 507 Computational Syntax and Morphology folders on the SIL-UND fileserver, so the links do not currently work. Sample lectures and quizzes are provided.
# |
Date |
Topic |
Assignment |
|
1 |
06/08/10 |
Introduction to Computational Linguistics
Readings
Topics covered:
What is Computational Linguistics
Introduction to Computational Syntax
FSMs
Chomsky hierarchy
Syntactic parsing with CFGs
Parsing vs. text processing
Material presented:
Lecture: Introduction to Computational Syntax (presentation, pdf)
|
Assignment_June10.pdf
Writing assignments: Summaries of the articles in the readings list |
|
2 |
06/14/10 |
Unix text processing and regular expressions
Readings
Topics covered:
Unix file management commands
Unix text processing techniques
Finite-state morphology: FSAs and regexes
Material presented:
Lecture: FSAs and Regexs (presentation, pdf)
Lecture: Regexs & FSAs part 2 (presentation, pdf)
Unix file management: Handout_1.pdf
Unix file management and text processing: Handout_2.pdf
Unix file management and text processing: Handout_3.pdf
|
Unix_Assignment_1.pdf Unix_Assignment_2.pdf Unix_Assignment_3.pdf
Writing assignments: Summaries of the articles in the readings list |
|
3 |
06/21/10 |
Information Extraction and text processing
Readings
Topics covered:
Information Extraction (IE):
– What is Information Extraction (IE)?
– What types of terms are usually extracted?
– What are two approaches to building extraction systems?
– How is the output usually evaluated?
– List some applications of IE
Regex metacharacters
More on grep processing
Relevance, Recall and Precision metrics
The concept of greediness in pattern matching
Material presented:
Lecture: Introduction to IE (presentation, pdf)
Lecture: IE challenges (presentation, pdf)
Handout: Relevance, P, R.pdf
Handout: Regex metacharacters.pdf
Handout: Dots, greediness and white spaces.pdf
Unix Handout 5.pdf
Unix Handout 6.pdf
Project proposals due: drafts by the end of week 3
|
Unix_Assignment_5.pdf Unix_Assignment_6.pdf
Writing assignments: Summaries of the articles in the readings list |
|
4 |
06/28/10 |
Advanced text processing techniques (sed, regexes, and automatic entity annotation and extraction)
Readings
Topics covered:
Unix text processing commands:
– Removing duplicates
– Translating text
– Word frequencies: uniq -c
– A few more sed commands
VisualText:
– entity types vs. entity terms
– entity relationships
OpenCalais: demo
Notepad++: regexes and proper name entity recognition and extraction
Material presented:
Unix_Handout_7.pdf
Notepad++ Regex Exercise.pdf
Handout_6302010_advanced_text_processing.pdf
Handout: Points to remember_Append_Uniq.pdf
Project proposals due: drafts by the end of week 4
|
Assignment_7.pdf Assignment_8.pdf Assignment_9.pdf
Writing assignments: Summaries of the articles in the readings list |
|
5 |
07/06/10 |
VisualText: natural language processing and hands-on rule writing for the automatic recognition, annotation and extraction of named entities in text
Readings
Topics covered:
VisualText:
– How to create a new analyzer
– How to select an initial template
– How to reload your new analyzer
– How to create an input file
– How to run the analyzer and view the parse tree
– How to generate built-in rules from sample data
– How to create a stub for the passes and pass files
– How to create concepts and concept folders for the generated rules
– How to populate the concepts with sample data
– How to generate built-in rules from the sample data
Extra lab, VisualText rule-writing:
– How to write your own rules for finding NPs, VPs, PPs and Possessive Phrases
– How to use simple NLP++ syntax for rule-writing
Realizational morphology
Material presented:
Lecture: Introduction to Computational Morphology (presentation, pdf)
VisualText tutorials: based on VisualText Help (part of the VisualText IDE)
Informal mid-term course evaluation (template)
|
Assignment_07072010.pdf Assignment_VisualText_POS_tagging.pdf*
Writing assignments: Summaries of the articles in the readings list
*See class overview: week 5 class 2 Wed 07072010.pdf |
|
6 |
07/12/10 |
Corpus linguistics tools and techniques
Readings
Topics covered:
Unix file and directory permissions
More advanced sed commands
Corpus linguistics:
– techniques, principles and best practices
– concordancing and concordancers
– lexical priming and semantic prosody
– denotation vs. collocations, colligations and semantic associations
– techniques for mining corpora (KWIC, comparison across different corpora, frequencies, etc.)
Tool: WordSmith how-to
Material presented:
Lecture: Introduction to Corpus Lx part1.pdf
Unix_Handout_7122010.pdf
Unix_Handout_7132010.pdf
Handout: CorpusBased Translation using WordSmith.pdf [includes corpora mining exercises on WordSmith]
|
Assignment_07122010.pdf Assignment_07132010.pdf Corpora hands on #1.pdf
Writing assignments: Summaries of the articles in the readings list |
|
7 |
07/19/10 |
Python Regex module, advanced string manipulation, morphological parsing using FLEx, interlinear texts
Readings
Topics covered:
Complex Unix strings and string manipulation
Toolbox: demo
FLEx: demo
Python IDLE and shell
Python: RE module
Python: string manipulation and regexes
Morphological parsing, interlinear text
Material presented:
Handout: Anagram_Eric_Clapton.pdf
Handout_7192010.pdf: string manipulation
Handout_7202010_part2.pdf: string manipulation
Handout_7202010_part3.pdf: string manipulation
Python_IDLE_intro_part_1.pdf
Python_IDLE_intro_part_2_REs_string_man.pdf
Python_IDLE_intro_part_3_more_REs&strings.pdf
|
Assignment_07192010.pdf Assignment_07202010.pdf Assignment_07212010.pdf Assignment_2_07212010.pdf Assignment_07222010.pdf
Writing assignments: Summaries of the articles in the readings list |
|
8 |
07/26/10 |
Machine Translation: shallow transfer for closely-related languages (and Python cont.)
Readings
Topics covered:
Machine translation:
– models, approaches, evaluation metrics and challenges
– shallow transfer of closely-related languages
Parsing street addresses using Python
Parsing phone numbers using Python
Parsing NL input, extracting and tagging NL elements using Python
How to write to files in Python
Material presented:
Lecture: Overview of MT models, approaches, challenges.pdf
Handout: Python_RE_case_study_1.pdf
Handout: Python_RE_case_study_2.pdf
Handout: Python_RE_case_study_3_ExtractingFromSs.pdf
Handout: Python_RE_case_study_4.pdf
Handout: Python_NLTK.pdf
Machine translation freeware tools: demo
Dissemination of final project template: Monday 7/26
Students' final project write-ups due: Friday 7/30 by 4 pm
|
Python_RE_case_study_1.pdf Python_RE_case_study_2.pdf Python_RE_case_study_3_ExtractingFromSs.pdf Python_RE_case_study_4.pdf
Writing assignments: Summaries of the articles in the readings list
|
|
9 |
08/02/10 |
Student in-class presentations and project evaluations
– handouts
– I/O sample
– demo of proposed implementation
Final course evaluations
|
No quiz |
Copyright © Eleni Koutsomitopoulou 2010 All Rights Reserved. No part of this document may be reproduced or otherwise used without the written consent of the author.