Multilingual Demographic Dictionary, second unified edition, English volume
Disclaimer: The sponsors of Demopaedia do not necessarily agree with all the definitions contained in this version of the Dictionary.
The harmonization of all the second editions of the Multilingual Demographic Dictionary is an ongoing process. Please consult the discussion area of this page for further comments.
The process of obtaining statistical data from documents not primarily designed for this purpose is called extraction 1. In general, whatever its source, statistical information is subjected to processing 2, which may be manual 3, mechanical 4, electronic 5 or a combination of these modes. Manual processing involves no equipment more complex than the desk calculator 6. Electronic processing uses computers (132-2*). Regardless of the mode of processing, certain types of operations 7 must be performed, including editing 8 of the data, tabulation (130-6*), calculation (132-2) and table preparation 9. How complex these operations are depends on the mode of processing selected.
- 1. Extraction, n. - extract, v.
- 2. Processing, n. - process, v. The terms to process information, data processing, are used widely.
- 8. Editing, n. - edit, v. In English, the term refers to an operation performed either on the basic document or on the machine-readable data, to correct inconsistencies or eliminate omissions. In French, edition refers to the stage of table preparation.
Editing the data usually requires the prior coding 1 of a certain number of entries on the basic document 2. The coding scheme 3 establishes a correspondence between an entry and its translation into numeric or alphabetic codes. The code book collects and describes the coding schemes used with a particular set of basic documents. A coding scheme is usually designed to facilitate later groupings of the data. In contrast, a classification 4 is a mere list of individual codes where each heading 5 is given one or several numbers. After the data have been coded, they constitute a file (213-3*) which can be converted to machine-readable form. The second stage of editing consists of the cleaning 6 of the file, in which errors are eliminated by validity checks 7 and consistency checks 7; these can be internal checks within each statistical unit (cf. 110-1) or may result from the comparison of different units. After errors have been identified, they may be corrected in the original document, or in the file by some automatic procedure.
- 1. Code, n. - code, v. - coding, n.
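The coding and cleaning steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coding scheme, the field names, the admissible age range and the consistency rule are all hypothetical and do not come from the Dictionary.

```python
# Hypothetical coding scheme: entry on the basic document -> numeric code.
MARITAL_STATUS_CODES = {"single": 1, "married": 2, "widowed": 3, "divorced": 4}


def code_record(entry):
    """Translate one basic-document entry (a dict) into a coded record."""
    status = entry["marital_status"].strip().lower()
    return {
        "age": entry["age"],
        # Entries absent from the scheme become None so the validity check flags them.
        "marital_status": MARITAL_STATUS_CODES.get(status),
    }


def validity_errors(record):
    """Validity check: each field must hold an admissible value."""
    errors = []
    if not isinstance(record["age"], int) or not 0 <= record["age"] <= 120:
        errors.append("age out of range")
    if record["marital_status"] not in MARITAL_STATUS_CODES.values():
        errors.append("unknown marital status code")
    return errors


def consistency_errors(record):
    """Internal consistency check: fields of one unit must agree with each other."""
    errors = []
    # Assumed rule, for illustration only: nobody under 10 is recorded as ever married.
    if (isinstance(record["age"], int) and record["age"] < 10
            and record["marital_status"] in (2, 3, 4)):
        errors.append("ever-married status inconsistent with age")
    return errors


# Invented entries standing in for basic documents.
entries = [
    {"age": 34, "marital_status": "Married"},
    {"age": 5, "marital_status": "widowed"},
    {"age": 130, "marital_status": "separated"},
]
coded_file = [code_record(e) for e in entries]
for rec in coded_file:
    print(rec, validity_errors(rec) + consistency_errors(rec))
```

The sketch only identifies errors. Whether they are then corrected on the original document or in the file is left open, as in the text.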
The edited data are rarely used directly; they are subjected to grouping (130-7) and tabulation (130-6*), and this normally leads to a presentation in the form of statistical tables (131-4). These may be the outcome of sorting 1, either manual or mechanical, resulting in the reorganization of the elements in a set according to predetermined rules, or more simply of a systematic count of the elements presenting a selected characteristic. The choice of elements or of characteristics may be based on the values of one or several quantitative attributes, or on the modalities 2 of one or several qualitative attributes. Few studies can do without computation, simple or complex, isolated or repetitive, and the computer (225-2) now allows calculation that would have been too lengthy by hand. These capabilities have led to the development of techniques of data analysis 3. Deterministic and stochastic models (cf. 730) often require considerable computations, and so do simulations (730-6).
- 1. Sorting, n. - sort, v.
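As a rough illustration of sorting, grouping and tabulation, the following Python sketch reorders a small set of records and counts the elements presenting each combination of characteristics. The records and the age-group cut points are invented for the example.

```python
from collections import Counter

# Hypothetical edited records: (age in completed years, sex).
records = [(23, "F"), (67, "M"), (4, "F"), (35, "M"), (71, "F"), (12, "M"), (29, "F")]


def age_group(age):
    """Group a quantitative attribute into broad classes (assumed cut points)."""
    if age < 15:
        return "0-14"
    if age < 65:
        return "15-64"
    return "65+"


# Sorting: reorder the elements according to a predetermined rule (sex, then age).
sorted_records = sorted(records, key=lambda r: (r[1], r[0]))

# Tabulation: count the elements presenting each combination of characteristics.
table = Counter((age_group(age), sex) for age, sex in records)

# Present the counts as a small two-way table, one row per age group.
for group in ("0-14", "15-64", "65+"):
    print(group, table[(group, "F")], table[(group, "M")])
```

Here age is a quantitative attribute grouped into classes, while sex is a qualitative attribute whose modalities are "F" and "M".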
The stage of table preparation (220-9) aims at making the results of processing conveniently available in the form of listings 1, numerical tables (131-4) or charts (155-2), all of which are commonly used in descriptive statistics 2. The use of computer graphing 3 and computer cartography 3 permits the mass production of graphical presentation.
Purely mechanical processing (220-4) did not involve the use of electronic equipment 1, which has come to replace the earlier tabulating machines 2 or unit record machines 2 and is much more versatile.
Demographic research depends heavily on electronic data processing 1 using the computer 2. The term hardware 3 refers to the physical components of the computer, whereas software 4 supplies the user 5 with the means of access to the computer. Computer specialists 6 include programmers 7, who write the programs 8 conceived by system analysts 9.
The hardware (225-3) components of a computer (225-2) include one or several central processing units 1, a central memory 2, one or more mass storage devices 3, which use magnetic tapes 4 or disks 5, and a set of input-output devices 6. The software (225-4) components include the operating system 7, which has the task of efficiently managing the available facilities 8 for the users (225-5) running the users’ programs 9, and the processing programs 10, which are preestablished programs (225-8) designed for the solution of standard problems.
A user (225-5) can process a problem by writing a program (225-8) in a general programming language 1 such as Fortran, Cobol, Basic or Algol, or in a specific language designed to use the processing programs (226-10) stored in the central memory (226-2) of the computer, such as a data base management system 2 used to create and maintain a data bank 2, a survey processing program 3 or a statistical package 4. The devices used to enter information into the computer and to receive information from it can differ according to the mode of processing. In batch processing 7, the normal input and output units are the card reader 5 and the line printer 6. A console 8 is the normal input and output unit for processing in a timesharing mode 9. In either instance the entry units may be spatially separated from the computer; processing under these conditions is accomplished by remote terminal 10.
- 1. In addition to programming languages as defined above, other types of languages can be used to manipulate the operating system; these are usually referred to as job control languages.
Any information processed in a computer (225-2) undergoes three main phases. The first is data entry 1 or input 1, which may be done by using an on line 2 device such as a keyboard console (227-8). Data which are already stored in the computer may be accessed from either central memory (226-2) or one of the mass storage devices (226-3) and used as input data. This is part of the data collection 3★, which goes from extraction (220-1) to the transcription on an electronic medium, through validity checks (221-7) and consistency checks (221-18) that can be made during data entry when working on line. The second phase, processing (220-2), is divided into two main types: numerical processing 4 and non-numerical processing 5. Statistical or arithmetic computations are normally the operations contained in the former, while data manipulation operations are the focus of the latter. In a third phase, occasionally referred to as the output phase, the processed results 6 or output 6 may be printed out on the line printer (227-6) or saved as a file on a mass storage device (226-3) for further processing. Results may also be diverted to a plotter 7 to obtain processed results in the form of a graph or a figure.
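The three phases can be pictured with a short Python sketch. It is a modern stand-in rather than a description of the equipment named above: an in-memory text file plays the role of a mass storage device, and the records are invented.

```python
import csv
import io
import statistics

# Phase 1, input: read records already stored on a medium.
# (io.StringIO stands in for a file on a mass storage device.)
stored = io.StringIO("name,age\nAda,36\nBob,41\nCy,28\n")
rows = list(csv.DictReader(stored))

# Phase 2, processing.
# Non-numerical processing: manipulate the data (here, reorder by name, descending).
rows_by_name = sorted(rows, key=lambda r: r["name"], reverse=True)
# Numerical processing: an arithmetic or statistical computation.
mean_age = statistics.mean(int(r["age"]) for r in rows)

# Phase 3, output: save the processed results as a file for further processing,
# then print them (the printed text stands in for line-printer output).
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["mean_age", mean_age])
print(out.getvalue())
```

The same output could equally be passed to a plotting library to obtain a graph, which corresponds to diverting the results to a plotter.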