How do dictionaries source attestation?
Some dictionaries cite attestations and try to give the earliest quotations they can find. How do they find them? Without electronic indexing, this must have been extremely difficult.
The reason I'm asking is that many answers on Stack Exchange cite the earliest attestation and imply that it is the earliest use, instead of the logically correct "in use at the latest in ...". That's a slightly different issue, which I mention to frame the question, not to make it too broad. To be fair, in many cases no inference is drawn from this.
lexicography
asked 22 hours ago by vectory
edited 13 hours ago by Peter Mortensen
Help, I don't know which answer to mark; they are both nearly identical. – vectory, 16 hours ago
3 Answers
Typically, you don't ever really know for certain that you have the earliest example. Or even the earliest written example. It's just the best so far.
(As a person who frequently writes answers to etymology questions on ELU, I try to make this clear. "According to the OED", "according to my own research", "dates at least back to X" are all things I say, but I sometimes get sloppy and don't do this all the time.)
There are some exceptions. We can be very certain that "cromulent", for example, was coined in 1996 (or '95, depending on when the episode of The Simpsons was written).
How etymological research is done has varied through time. In the case of the "New English Dictionary" (the first edition of the Oxford English Dictionary), work started on it in 1857. Then:
[I]n January 1859, the Society issued their 'Proposal for the publication of a New English Dictionary,' in which the characteristics of the proposed work were explained, and an appeal made to the English and American public to assist in collecting the raw materials for the work, these materials consisting of quotations illustrating the use of English words by all writers of all ages and in all senses, each quotation being made on a uniform plan on a half-sheet of notepaper, that they might in due course be arranged and classified alphabetically and by meanings. This Appeal met with a generous response: some hundreds of volunteers began to read books, make quotations, and send in their slips to 'sub-editors,' who volunteered each to take charge of a letter or part of one, and by whom the slips were in turn further arranged, classified, and to some extent used as the basis of definitions and skeleton schemes of the meanings of words in preparation for the Dictionary.
An Appeal to the English-Speaking and English-Reading Public to Read Books and Make Extracts for The Philological Society's New English Dictionary
The "Reading Programme" is still used by the OED, although the methodology is different. The books are still read all the same but here's what happens next according to a freelance researcher for the OED:
I then consult OED Online to determine whether the word or phrase is in the Dictionary: if it is not, I submit it as a ‘not-in’, and if it is, I decide whether its form or context is important enough to warrant its submission. If it does qualify, I enter the information into tagged fields in an electronic file that has been set up in a standard format. When I have finished the reading, I submit the file to Oxford or New York, where the records are incorporated into OED‘s working database for consideration by the editors, along with thousands of paper citation slips, as they proceed through the current revision. Yes, some of my finds are still submitted as paper slips—a reminder of OED‘s long heritage—but, electronic or paper, I can hardly imagine a better job.
The quotations were collected in a machine-readable format for the first time in 1989. The 1990 UK Reading Programme captured material electronically. (Note that the second edition of the Oxford English Dictionary came out in 1989.)
In addition to this, the OED now utilizes several online databases of texts, such as Early English Books Online, Eighteenth Century Collections Online, and some newspaper databases.
If you do your own research with databases (many people use the free Google Books), it's often easy to antedate OED entries that haven't yet been updated for the third edition. Updates for the OED3 started in 2000 and continue to this day: it's a huge dictionary and updating takes time.
See also:
- OED: Researching the Language
I'm asking specifically because Google Books is not satisfactory, not least because OCR fails on old books. I take it that "you don't ever really know for certain" covers lost copies as well as unread or unscanned books. "Hundreds of volunteers" pales in comparison to the number of available books. – vectory, 16 hours ago
It is not uncommon to find usage that predates the earliest attestation noted in the OED; for example, such earlier attestations were found in the Corpus of Early English Correspondence.
Of course, once every surviving written document from the past has been digitised, this process will come to an end, unless some old writings are newly discovered.
... or if the digitization is faulty – vectory, 16 hours ago
Given enough time and workforce, there will be perfect or near-perfect digitisations of everything and not just mediocre OCR'ed scans. Of course, there is some leeway for different readings or reconstructions of partially destroyed texts. – jknappen, 16 hours ago
I agree with your assumption that the date of the earliest recorded usage of a word does not necessarily correspond to the earliest usage of a word, since words may have been in circulation in spoken language before they were first used in publications, and many old publications have simply not survived. I touched upon this issue in my answer to the question How many of Shakespeare's words in his plays were new? on Literature Stack Exchange: a number of words that are first attested in Shakespeare's plays may have circulated earlier, either in the spoken language or in writings that have not survived (or both).
Before there were digital corpora and digital texts, lexicographers had to read physical books to find usage examples. The editors of the first edition of the Oxford English Dictionary (who started their work in 1857 or 1858) asked thousands of volunteer readers to submit usage examples, in what was an early example of crowdsourcing. (See the crowdsourcing timeline, which says that 800 volunteers contributed to the first fascicle alone.)
See also the description of the OED's Reading Programme. James A. H. Murray set this up in 1879; its focus was not specifically on finding the earliest examples of words and phrases. In addition, the first edition "relied heavily on a small number of authors (notably, of course, Shakespeare) for its coverage of Early Modern English (1500-1700)". This selection was possibly too narrow to find the earliest surviving examples of words and phrases. For this reason, the current Historical Reading Programme looks at a broader spectrum of texts:
Today, readers systematically survey a much broader spectrum of texts from this and other periods. A separate Historical Reading Programme has been created to serve this function. Readers participating in this programme supply material specifically for the revision of the OED; usually these are earlier examples of words or phrases that are already included in the Dictionary. These readers check their findings against the Oxford English Dictionary to ensure that only significant items are filed, such as earlier examples of words and meanings or terms that have not yet been registered in the database.
In short, before the digital era, finding early usage examples required a lot of reading.
answered 20 hours ago by Christophe Strobbe
edited 17 hours ago
add a comment |
add a comment |
Typically, you don't ever really know for certain that you have the earliest example. Or even the earliest written example. It's just the best so far.
(As a person who frequently writes answers to etymology questions on ELU, I try to make this clear. "According to the OED", "according to my own research", "dates at least back to X" are all things I say, but I sometimes get sloppy and don't do this all the time.)
There are some exceptions. We can be very certain that "cromulent", for example, was coined in 1996 (or '95 depending on when the episode of the Simpsons was written).
How etymological research is done has varied through time. In the case of the "New English Dictionary" (the first edition of the Oxford English Dictionary), work started on it in 1857. Then:
[I]n January 1859, the Society issued their 'Proposal for the publication of a New English Dictionary,' in which the characteristics of the proposed work were explained, and an appeal made to the English and American public to assist in collecting the raw materials for the work, these materials consisting of quotations illustrating the use of English words by all writers of all ages and in all senses, each quotation being made on a uniform plan on a half-sheet of notepaper, that they might in due course be arranged and classified alphabetically and by meanings. This Appeal met with a generous response: some hundreds of volunteers began to read books, make quotations, and send in their slips to 'sub-editors,' who volunteered each to take charge of a letter or part of one, and by whom the slips were in tum further arranged, classified, and to some extent used as the basis of definitions and skeleton schemes of the meanings of words in preparation for the Dictionary.
An Appeal to the English-Speaking and English-Reading Public to Read Books and Make Extracts for The Philological Society's New English Dictionary
The "Reading Programme" is still used by the OED, although the methodology is different. The books are still read all the same but here's what happens next according to a freelance researcher for the OED:
I then consult OED Online to determine whether the word or phrase is in the Dictionary: if it is not, I submit it as a ‘not-in’, and if it is, I decide whether its form or context is important enough to warrant its submission. If it does qualify, I enter the information into tagged fields in an electronic file that has been set up in a standard format. When I have finished the reading, I submit the file to Oxford or New York, where the records are incorporated into OED‘s working database for consideration by the editors, along with thousands of paper citation slips, as they proceed through the current revision. Yes, some of my finds are still submitted as paper slips—a reminder of OED‘s long heritage—but, electronic or paper, I can hardly imagine a better job.
The quotations were collected in a machine readable format for the first time in 1989. The 1990 UK Reading Programme captured material electronically. (Note that the second edition of the Oxford English Dictionary came out in 1989.)
In addition to this, the OED now utilizes several online databases of texts, such as Early English Books Online, Eighteenth Century Collections Online, and some newspaper databases.
If you do your own research with databases (many people use the free Google Books), it's often easy to beat pages that haven't been updated for the third edition of the OED. Updates to the OED3 started in 2000 and continue to this day: it's a huge dictionary and updating takes time.
See also:
- OED: Researching the Language
I'm asking specifically because gbooks is not satisfying, not the least because OCR fails for old books. I take it that "you don't ever really know for certain" counts for lost copies, as well as otherwise unread/not-scanned books. "hundreds of volunteers" pale in comparison to the amount of available books.
– vectory
16 hours ago
add a comment |
Typically, you don't ever really know for certain that you have the earliest example. Or even the earliest written example. It's just the best so far.
(As a person who frequently writes answers to etymology questions on ELU, I try to make this clear. "According to the OED", "according to my own research", "dates at least back to X" are all things I say, but I sometimes get sloppy and don't do this all the time.)
There are some exceptions. We can be very certain that "cromulent", for example, was coined in 1996 (or '95 depending on when the episode of the Simpsons was written).
How etymological research is done has varied through time. In the case of the "New English Dictionary" (the first edition of the Oxford English Dictionary), work started on it in 1857. Then:
[I]n January 1859, the Society issued their 'Proposal for the publication of a New English Dictionary,' in which the characteristics of the proposed work were explained, and an appeal made to the English and American public to assist in collecting the raw materials for the work, these materials consisting of quotations illustrating the use of English words by all writers of all ages and in all senses, each quotation being made on a uniform plan on a half-sheet of notepaper, that they might in due course be arranged and classified alphabetically and by meanings. This Appeal met with a generous response: some hundreds of volunteers began to read books, make quotations, and send in their slips to 'sub-editors,' who volunteered each to take charge of a letter or part of one, and by whom the slips were in tum further arranged, classified, and to some extent used as the basis of definitions and skeleton schemes of the meanings of words in preparation for the Dictionary.
An Appeal to the English-Speaking and English-Reading Public to Read Books and Make Extracts for The Philological Society's New English Dictionary
The "Reading Programme" is still used by the OED, although the methodology is different. The books are still read all the same but here's what happens next according to a freelance researcher for the OED:
I then consult OED Online to determine whether the word or phrase is in the Dictionary: if it is not, I submit it as a ‘not-in’, and if it is, I decide whether its form or context is important enough to warrant its submission. If it does qualify, I enter the information into tagged fields in an electronic file that has been set up in a standard format. When I have finished the reading, I submit the file to Oxford or New York, where the records are incorporated into OED‘s working database for consideration by the editors, along with thousands of paper citation slips, as they proceed through the current revision. Yes, some of my finds are still submitted as paper slips—a reminder of OED‘s long heritage—but, electronic or paper, I can hardly imagine a better job.
The quotations were collected in a machine readable format for the first time in 1989. The 1990 UK Reading Programme captured material electronically. (Note that the second edition of the Oxford English Dictionary came out in 1989.)
In addition to this, the OED now utilizes several online databases of texts, such as Early English Books Online, Eighteenth Century Collections Online, and some newspaper databases.
If you do your own research with databases (many people use the free Google Books), it's often easy to beat pages that haven't been updated for the third edition of the OED. Updates to the OED3 started in 2000 and continue to this day: it's a huge dictionary and updating takes time.
See also:
- OED: Researching the Language
I'm asking specifically because gbooks is not satisfying, not the least because OCR fails for old books. I take it that "you don't ever really know for certain" counts for lost copies, as well as otherwise unread/not-scanned books. "hundreds of volunteers" pale in comparison to the amount of available books.
– vectory
16 hours ago
add a comment |
Typically, you don't ever really know for certain that you have the earliest example. Or even the earliest written example. It's just the best so far.
(As a person who frequently writes answers to etymology questions on ELU, I try to make this clear. "According to the OED", "according to my own research", "dates at least back to X" are all things I say, but I sometimes get sloppy and don't do this all the time.)
There are some exceptions. We can be very certain that "cromulent", for example, was coined in 1996 (or '95 depending on when the episode of the Simpsons was written).
How etymological research is done has varied through time. In the case of the "New English Dictionary" (the first edition of the Oxford English Dictionary), work started on it in 1857. Then:
[I]n January 1859, the Society issued their 'Proposal for the publication of a New English Dictionary,' in which the characteristics of the proposed work were explained, and an appeal made to the English and American public to assist in collecting the raw materials for the work, these materials consisting of quotations illustrating the use of English words by all writers of all ages and in all senses, each quotation being made on a uniform plan on a half-sheet of notepaper, that they might in due course be arranged and classified alphabetically and by meanings. This Appeal met with a generous response: some hundreds of volunteers began to read books, make quotations, and send in their slips to 'sub-editors,' who volunteered each to take charge of a letter or part of one, and by whom the slips were in tum further arranged, classified, and to some extent used as the basis of definitions and skeleton schemes of the meanings of words in preparation for the Dictionary.
An Appeal to the English-Speaking and English-Reading Public to Read Books and Make Extracts for The Philological Society's New English Dictionary
The "Reading Programme" is still used by the OED, although the methodology is different. Books are still read in the same way, but here is what happens next, according to a freelance researcher for the OED:
I then consult OED Online to determine whether the word or phrase is in the Dictionary: if it is not, I submit it as a 'not-in', and if it is, I decide whether its form or context is important enough to warrant its submission. If it does qualify, I enter the information into tagged fields in an electronic file that has been set up in a standard format. When I have finished the reading, I submit the file to Oxford or New York, where the records are incorporated into OED's working database for consideration by the editors, along with thousands of paper citation slips, as they proceed through the current revision. Yes, some of my finds are still submitted as paper slips—a reminder of OED's long heritage—but, electronic or paper, I can hardly imagine a better job.
The quotations were collected in a machine-readable format for the first time in 1989. The 1990 UK Reading Programme captured material electronically. (Note that the second edition of the Oxford English Dictionary came out in 1989.)
In addition to this, the OED now utilizes several online databases of texts, such as Early English Books Online, Eighteenth Century Collections Online, and some newspaper databases.
If you do your own research with databases (many people use the free Google Books), it's often easy to beat pages that haven't been updated for the third edition of the OED. Updates to the OED3 started in 2000 and continue to this day: it's a huge dictionary and updating takes time.
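For anyone doing this kind of antedating on their own text collection, the core step can be sketched in a few lines of Python. This is only a minimal illustration, not how the OED or any database actually works: the `Citation` record, the `earliest_attestation` helper, and the sample data are all invented for the example, and it assumes you already have passages with reliable publication years.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Hypothetical record for one dated quotation (not an OED format)."""
    year: int      # publication year of the source text
    source: str    # short title of the source
    text: str      # the quoted passage


def earliest_attestation(word, citations):
    """Return the earliest citation containing `word` as a whole word, or None."""
    # \b word boundaries avoid matching the word inside a longer word.
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    hits = [c for c in citations if pattern.search(c.text)]
    return min(hits, key=lambda c: c.year, default=None)


# Invented sample data, for illustration only.
SAMPLE = [
    Citation(1902, "Hypothetical Gazette", "A most frobnicate affair, said the mayor."),
    Citation(1871, "Hypothetical Letters", "We did frobnicate the engine until dusk."),
    Citation(1850, "Hypothetical Almanac", "The frobnicated wheel is of no use here."),
]

best = earliest_attestation("frobnicate", SAMPLE)
print(best.year, best.source)
```

Note that the exact-form match skips the 1850 passage, which only contains the inflected form "frobnicated". Real antedating work has to cope with inflections, spelling variants, and OCR errors, so the result is still only "the best so far".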
See also:
- OED: Researching the Language
answered 19 hours ago
Laurel
I'm asking specifically because gbooks is not satisfying, not least because OCR fails for old books. I take it that "you don't ever really know for certain" applies to lost copies, as well as otherwise unread or unscanned books. "Hundreds of volunteers" pales in comparison to the number of available books.
– vectory
16 hours ago
It is not uncommon to find usage that predates the earliest attestation noted in the OED; such earlier attestations have been found, for example, in the Corpus of Early English Correspondence.
Of course, once every written document from the past has been digitised, this process will come to an end, unless some old writings are newly discovered.
answered 16 hours ago
jknappen
... or if the digitization is faulty
– vectory
16 hours ago
Given enough time and workforce there will be perfect or near-perfect digitisations of everything, and not just mediocre OCR'ed scans. Of course there is some leeway for different readings or reconstructions of partially destroyed texts.
– jknappen
16 hours ago
Help, I don't know which answer to mark, they are both nearly identical.
– vectory
16 hours ago