Center Learning Organism Cell Simulations
Learning: Language acquisition (continued)
The aim of this project is to simulate the task of learning a language. Based on Chomsky's X-bar theory, algorithms have been developed to determine the appropriate part of speech (POS) of a particular word and to place the word in its proper context. Once the most probable POS has been determined, the word is networked to other words within the sentence, fixing a permanent relationship. In this way, nouns are matched with the adjectives that describe them, linked to verbs, and coupled to other nouns that can modify them. All of these connections are stored in multiple arrays and indexed to the source from which the links were derived.
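The linking and source indexing described above can be pictured with a small sketch. This is not the project's actual code: the class name WordNetwork, the POS labels, the three attachment rules, and the "nearest following word" heuristic are all assumptions chosen for illustration, and a dictionary stands in for the multiple arrays mentioned in the text.

```python
from collections import defaultdict

# Hypothetical attachment rules: (POS of the word, POS of the word it links to).
RULES = [("ADJ", "NOUN"), ("ADV", "VERB"), ("NOUN", "VERB")]


class WordNetwork:
    """Illustrative store of word-to-word links, indexed by source."""

    def __init__(self):
        # (relation, word, target) -> set of sources where the link was seen
        self.links = defaultdict(set)

    def add_sentence(self, tagged, source):
        """Link words in one sentence; tagged is a list of (word, pos) pairs."""
        for i, (word, pos) in enumerate(tagged):
            for from_pos, to_pos in RULES:
                if pos != from_pos:
                    continue
                # Assumed heuristic: attach to the nearest following word
                # that carries the target POS.
                for target, target_pos in tagged[i + 1:]:
                    if target_pos == to_pos:
                        relation = f"{from_pos}->{to_pos}"
                        self.links[(relation, word, target)].add(source)
                        break

    def links_for(self, word):
        """Return every stored link that involves the given word."""
        return sorted(
            (relation, a, b, sorted(sources))
            for (relation, a, b), sources in self.links.items()
            if word in (a, b)
        )
```

Feeding in a tagged sentence such as [("quick", "ADJ"), ("fox", "NOUN"), ("jumps", "VERB")] with a source label, then calling links_for("fox"), lists both the adjective link and the verb link along with the source they came from. Keeping the source on every link is what lets a later query trace an answer back to the page it was learned from.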

The current implementation of this project includes six main components:
1. A web crawler which acquires web pages based on a specified topic.
2. A sentence parser which excises full sentences from HTML, omitting titles, HTML tags, and scripts.
3. A word parser which identifies words and retrieves a dictionary entry listing the possible POS of each word.
4. An iterative X-bar algorithm which selects the most probable POS for each word.
5. A networking subroutine which identifies and stores all possible word-to-word interactions, e.g. adjective -> noun, adverb -> verb, noun -> verb.
6. A user interface which provides access to the networked database, allowing users to ask questions or to supply new sentences to be parsed and networked.
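
As an illustration of component 2, the sketch below pulls full sentences out of an HTML page using only the Python standard library. It is an assumption about how such a parser might work, not the project's implementation: the names extract_sentences and min_words, the choice of which tags to skip (including treating h1 to h6 headings as titles), and the naive sentence pattern are all hypothetical.

```python
import re
from html.parser import HTMLParser

# Assumed choices: content inside these tags is dropped entirely.
SKIP_TAGS = {"script", "style", "title", "h1", "h2", "h3", "h4", "h5", "h6"}
# Assumed choices: these tags end the current block of running text.
BLOCK_TAGS = {"p", "div", "li", "br", "td", "tr"}


class _TextBlocks(HTMLParser):
    """Collects visible text, grouped into blocks at block-level tags."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.blocks = [[]]

    def _new_block(self):
        if self.blocks[-1]:
            self.blocks.append([])

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._new_block()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._new_block()

    def handle_data(self, data):
        if not self._skip_depth:
            self.blocks[-1].append(data)


def extract_sentences(html, min_words=3):
    """Return sentences found in the visible text of an HTML document."""
    parser = _TextBlocks()
    parser.feed(html)
    parser.close()
    sentences = []
    for parts in parser.blocks:
        text = " ".join("".join(parts).split())  # collapse whitespace
        # Naive pattern: a capital letter up to the next ., ! or ?
        for candidate in re.findall(r"[A-Z][^.!?]*[.!?]", text):
            if len(candidate.split()) >= min_words:
                sentences.append(candidate.strip())
    return sentences
```

Text fragments without terminal punctuation, such as menu labels, never match the pattern and are dropped, which approximates "full sentences only". The pattern is deliberately simple and will mis-split on abbreviations and decimal numbers; a production parser would need a more careful sentence boundary rule.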






Copyright © 2006 CLOCS All rights reserved.