UNIT 1.2.

BEFORE CHOMSKY: field linguists and linguists of the structuralist tradition. CORPUS-BASED work is empirical and relies on observed data. TWO ASSUMPTIONS: the sentences of a natural language are finite, and they can be collected and enumerated. CORPUS: the only source of linguistic evidence in the formation of linguistic theories.

CHOMSKY'S REVOLUTION (the corpus is of no interest): between 1957 and 1965 Chomsky changed the direction of linguistics from empiricism to rationalism. Intuitions started to be relied on as evidence.

WHAT IS NOT A CORPUS? A list of words, a text archive, a collection of citations, a collection of quotations, a single text, or THE WEB (whether the web counts as a corpus depends on the linguist: Sinclair, for example, does not recognise it as one, whereas for Kilgarriff it is a corpus). KILGARRIFF: people confuse what a corpus is with what a good corpus is.

CORPUS ANALYSIS TOOLS: SKETCH ENGINE (outstanding example; Kilgarriff and Rychlý): a corpus manager and text analysis tool for exploring how language works. Its algorithms analyse authentic corpora of billions of words to identify instantly what is typical in language and what is rare. WORD SKETCH (key concept): a one-page summary of a word's grammatical and collocational behaviour. It shows the word's collocates categorized by grammatical relation, such as words that serve as the object of the verb or words that serve as the subject of the verb.

BNC: British National Corpus. COCA: Corpus of Contemporary American English. MARK DAVIES created COCA, an example of a generalized corpus. Where? At BYU (Brigham Young University). The BNC is also a generalized corpus.

THE SIZE of a specific corpus is the number of tokens that it has. LEMMATIZATION: reducing every word form to its base form (lemma), e.g. "cats" to "cat" and "sitting" to "sit".
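
EXAMPLE (sketch only): the two definitions above, corpus size as a token count and lemmatization as reduction to a base form, can be illustrated with a few lines of Python. Everything here is an assumption made for illustration: the sample text, the tiny hand-made lemma table and the function names are invented, and only words are counted as tokens, whereas real tools such as Sketch Engine use much larger resources and may count tokens differently.

```python
import re
from collections import Counter

# Hypothetical mini-corpus, invented for illustration only.
TEXT = "The cats sat on the mat. The cat was sitting on the mats."

# Hypothetical hand-made lookup table mapping word forms to lemmas.
# A real lemmatizer relies on a full dictionary and part-of-speech information.
LEMMAS = {
    "cats": "cat",
    "mats": "mat",
    "sat": "sit",
    "sitting": "sit",
    "was": "be",
}


def tokenize(text):
    """Split the text into lowercase word tokens (punctuation is ignored)."""
    return re.findall(r"[a-z]+", text.lower())


def lemmatize(tokens):
    """Replace each token by its lemma; unknown forms are left unchanged."""
    return [LEMMAS.get(token, token) for token in tokens]


tokens = tokenize(TEXT)
lemmas = lemmatize(tokens)

corpus_size = len(tokens)        # size of the corpus = number of tokens
type_count = len(set(tokens))    # number of distinct word forms
lemma_counts = Counter(lemmas)   # frequency of each lemma

print("tokens:", corpus_size)
print("distinct word forms:", type_count)
print("distinct lemmas:", len(lemma_counts))
print("most frequent lemmas:", lemma_counts.most_common(3))
```

After lemmatization, "cat" and "cats" are counted together, which is why frequency lists and collocation summaries are usually built on lemmas rather than on raw word forms.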

