Multilingual Demographic Dictionary, second unified edition, English volume
13
Disclaimer: The sponsors of Demopaedia do not necessarily agree with all the definitions contained in this version of the Dictionary. The harmonization of all the second editions of the Multilingual Demographic Dictionary is an ongoing process. Please consult the discussion area of this page for further comments.
130
The terms population statistics ^{1} or demographic statistics ^{1} refer to numerical data ^{2} about populations, which are based on observations ^{3} . After such observations have been collected ^{4} on appropriate forms (206-1), the documents are edited ^{5} and verified ^{5} to eliminate obvious inconsistencies. The data are tabulated ^{6} into certain groups ^{7} or classes ^{8} with common characteristics. Data processing ^{9} includes all the steps between collecting and statistical analysis ^{10★} (132-1).
- 1. Statistics, n. - statistical, adj. - statistician, n.: specialist in statistics.
- 4. Collect, v. - collection, n.
- 5. Edit, v. - editing, n. Verify, v. - verification, n.
- 6. Tabulate, v. - tabulation, n.
- 9. Process, v. - processing, n.
131
Data are usually referred to as raw data ^{1} or crude data ^{1} prior to their processing and tabulation, and as basic data ^{1} or primary data ^{1} after processing and tabulation. Basic data usually consist of series ^{2} of absolute numbers ^{3} which are put together in the form of statistical tables ^{4}. In such tables the data are generally classified with respect to certain variables ^{5} or variates ^{5} such as age, number of children, etc., or with respect to certain attributes ^{6} or characteristics ^{6} (e.g. sex, marital status, etc.). Where data are classified with respect to several variables or attributes simultaneously, the tables are called cross-tabulations ^{7} or contingency tables ^{7}. Summary tables ^{8} give information in less detail than do individual tables ^{9}.
- 1. When the data relate to individuals (110-2) as their unit of analysis they may be referred to as micro-data. Aggregate data or macro-data relate to a unit of analysis other than an individual, for example, a nation or an administrative unit within a nation. Micro-data can be derived from several sources such as a field survey (203-5) or a sample of vital registration records. A new source of micro-data is the census public use sample, which is a systematic or a random sample of census returns that is made available for analytical purposes to interested individuals.
- 7. A table which presents the distribution of a single variable or attribute within a population is generally called a frequency table.
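The distinction between a frequency table and a cross-tabulation can be illustrated with a minimal sketch. The records below are invented micro-data, and the names `records`, `frequency_table` and `cross_tabulation` are chosen for illustration only.

```python
from collections import Counter

# Hypothetical micro-data: one record per individual (sex, marital status).
records = [
    ("female", "single"),
    ("female", "married"),
    ("male", "married"),
    ("male", "single"),
    ("female", "married"),
]

# Frequency table: distribution of a single attribute (here, sex).
frequency_table = Counter(sex for sex, _ in records)

# Cross-tabulation: classification by two attributes simultaneously.
cross_tabulation = Counter(records)

for (sex, status), count in sorted(cross_tabulation.items()):
    print(sex, status, count)
```

Each cell of the cross-tabulation counts the individuals sharing one combination of the two attributes; summing the cells over marital status gives back the frequency table by sex.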
132
Using the basic data generally involves two phases. Analysis ^{1} aims at isolating the components of the observed numbers (size, structure, extraneous factors and the phenomenon under investigation); synthesis ^{2} is the process of recombining the disaggregated components in various ways. Either phase involves the calculation ^{3} or computation ^{3} of indices ^{4} which may be denoted by various names (cf. § 133). In contrast to the basic data, these indices are referred to as results ^{6} or synthetic indices ^{5★}. In a more restricted sense an index ^{7} (pl. indexes or indices) or index number ^{7} is a ratio showing the value of a given quantity relative to a base ^{8}, which is usually taken as 100. Some indices are good indicators ^{9} of a complex situation; thus the infant mortality rate is sometimes used as an indicator of the health status of a population.
- 1. Analysis, n. - analytical, adj. - analyze, v.
- 3. Calculate, v. - calculation, n. - calculator, n.: a machine with minimal to modest data storage capabilities designed to facilitate a modest amount of arithmetic and statistical calculating.
Compute, v. - computation, n. - computer, n.: a machine system designed to effect the transmission, storage and calculations of large data sets; it permits arithmetic and statistical calculations as well as logical processing of data. In a dated sense, the terms calculator and computer were used to designate the person(s) engaged in the computations.
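As an illustration of an index number in the restricted sense, the following sketch expresses a series relative to a base taken as 100. The series is invented, and the function name `index_numbers` and its parameters are assumptions made for the example.

```python
def index_numbers(values, base_position=0, base=100.0):
    """Express each value relative to the value at base_position, set to `base`."""
    base_value = values[base_position]
    return [base * value / base_value for value in values]


# Hypothetical annual numbers of events; the first year serves as the base.
series = [2000, 2100, 1900]
indexed = index_numbers(series)
print(indexed)
```

With the first year as base, a value of 105 indicates a quantity 5 per cent above the base value, and a value of 95 one that is 5 per cent below it.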
133
One of the first stages of analysis (132-1) consists of relating the population totals or number of events to other totals or numbers. The resulting indices are given various names. A ratio ^{6★}, also used for various purposes, is the quotient obtained by dividing quantities of the same kind. When the dividend and divisor are of the same kind but belong to different categories (men and women, children and women, or different age groups, for example), another terminology may be used in non-English languages, relating the two quantities by a specific ratio ^{1} (such as a sex ratio). A proportion ^{2} is a ratio which indicates the relation in magnitude of a part to the whole. A percentage ^{3} is a proportion expressed per hundred. A rate ^{4} is a special type of ratio used to indicate the relative frequency ^{5} of the occurrence of a particular event within a population or a sub-population in a specified period of time, usually one year. Although this usage is recommended, the term has steadily acquired a wider meaning and is often incorrectly used as a synonym for ratio (e.g. labor force participation rate, which is actually a proportion).
- 2. Proportion, n. - proportional, adj.
- 4. Rates are generally given per thousand, and where the term "rate" is used without additional qualification "per thousand" is generally understood. Some rates, however, are given per ten thousand, per one hundred thousand, or per million, e.g. cause-specific death rates (421-10). On other occasions rates may be given per person or per hundred. The word "rate" is sometimes omitted; thus one may find the expression "a mortality of ten per thousand," but this is not recommended.
- 6. The total fertility rate (cf. 639-4) is the sum of age-specific fertility rates (cf. 633-9) over the reproductive period and thus loses the inverse temporal dimension (per year) of those rates. The difference is as important as that between length and area, or between velocity and acceleration. The term synthetic index (cf. 132-5) is preferred in some languages to avoid confusion with the inverse temporal dimension (per year) of a rate, i.e. the number of demographic events divided by the exposure time in person-years. If used, the term rate in the expression total fertility rate refers to the implicit "per woman", which is not enough to qualify the index as a rate but is enough for a dimensionless ratio.
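The four kinds of index defined in this paragraph can be compared in a short sketch. All figures are invented, and expressing the sex ratio as males per 100 females is a convention assumed here rather than one stated above.

```python
# Hypothetical population counts and events for one calendar year.
males, females = 4900, 5100
births = 150
mid_year_population = males + females

# Ratio: two quantities of the same kind from different categories.
sex_ratio = 100 * males / females  # males per 100 females (assumed convention)

# Proportion: a part related to the whole.
proportion_female = females / (males + females)

# Percentage: the same proportion expressed per hundred.
percentage_female = 100 * proportion_female

# Rate: events in a specified period related to the population, per thousand.
crude_birth_rate = 1000 * births / mid_year_population

print(round(sex_ratio, 1), proportion_female, percentage_female, crude_birth_rate)
```

Only the last quantity carries a time dimension, since the births are those of one year; the other three describe the population at a point in time.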
134
The relative frequency (133-5) of a non-renewable event is often regarded as an empirical measure of the probability ^{1} of occurrence of that event. This presumes that all the individuals who appear in the denominator have been exposed to risk ^{3} in some way, i.e. there must have been a chance ^{2} or risk ^{2} that the event in question could happen to them. The use of the term "risk" does not imply that the event in question is in any way unwanted; thus the term "risk of marriage" is used. The population is often divided into different sub-groups, in which the risk of the event in question is less variable between individuals than in the population as a whole; the subgroup is more homogeneous ^{4} with respect to the risk than the relatively heterogeneous ^{5} whole population. Rates calculated for such subgroups are called specific rates ^{6} as opposed to crude rates (136-8) which apply to the population as a whole. General rates ^{7} sometimes involve an age restriction, as in the instance of general fertility rates (633-8).
- 1. Probability, n. - probable, adj.
- 4. Homogeneous, adj. - homogeneity, n.
- 5. Heterogeneous, adj. - heterogeneity, n.
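The contrast between specific rates and a crude rate may be sketched as follows. The two sub-groups and their figures are hypothetical, and the rates are expressed per thousand.

```python
# Hypothetical sub-groups: (mid-year population, deaths during the year).
subgroups = {
    "under 65": (8000, 16),
    "65 and over": (2000, 84),
}

# Specific rates, per thousand, one for each sub-group.
specific_rates = {
    name: 1000 * deaths / population
    for name, (population, deaths) in subgroups.items()
}

# Crude rate, per thousand, for the population as a whole.
total_population = sum(population for population, _ in subgroups.values())
total_deaths = sum(deaths for _, deaths in subgroups.values())
crude_rate = 1000 * total_deaths / total_population

print(specific_rates, crude_rate)
```

In this invented case the crude rate lies between the two specific rates and conceals the large difference in risk between the sub-groups.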
135
Age-specific rates ^{1} are computed for single years of age or for age groups (age-group specific rate ^{2★} or age group-specific rate ^{2★}). Duration-specific rates ^{3} take into account the time elapsed since a baseline event ^{4} or event-origin ^{4} such as marriage or a previous birth. Central rates ^{10} are obtained by dividing the number of events during a year, or some other period (often five years) either by the average population ^{6} or mid-year population ^{6} or by the number of person-years ^{7} of exposure to the event in question during that year or period; the number of person-years is the sum, expressed in years, of the exposure time for all individuals in the observed group, over the year or period. The term rate is sometimes used also for another type of measure, obtained by dividing the number of non-renewable events in a year or a period of years by the size of the cohort considered at the beginning of the year or period; this measure is sometimes called an attrition probability ^{5} or more simply a probability ^{5}, and contrasted with the central rate, defined earlier. In this paragraph, the word "period" has referred to a length of time. In the expression period rates ^{8}, however, the word is used in its chronological meaning and refers to a specific calendar year or group of years; it is opposed to cohort rate ^{9} or generation rate ^{9}.
- 5. The word quotient, used in French for this type of rate, has sometimes been used in English.
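The difference between a central rate and an attrition probability can be shown with a small numerical sketch. The cohort is hypothetical, and it is assumed, for illustration only, that each person experiencing the event is exposed for half of the year while all others are exposed for the full year.

```python
# Hypothetical cohort observed over one year for a non-renewable event.
cohort_size_at_start = 1000
events = 20

# Exposure time in years for each individual (assumption: persons who
# experience the event contribute half a year, all others a full year).
exposure_times = [1.0] * (cohort_size_at_start - events) + [0.5] * events

# Person-years: sum of the exposure times of all individuals.
person_years = sum(exposure_times)

# Central rate: events divided by person-years of exposure.
central_rate = events / person_years

# Attrition probability: events divided by the cohort size at the start.
attrition_probability = events / cohort_size_at_start

print(person_years, central_rate, attrition_probability)
```

Under these assumptions the person-years of exposure are fewer than the initial cohort size, so the central rate is slightly higher than the probability.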
136
Data are called provisional ^{1} if they are based on incomplete or insufficiently controlled observations. They are replaced by final ^{2} data when the observations are complete. Rates based on such data are called provisional rates ^{3} and final rates ^{4} respectively. Where information becomes available after figures have already been published, revised rates ^{5} may be issued. The expression corrected rate ^{6} usually implies that defective data or inappropriate methods have yielded results which are either misleading or of limited value for the purpose in hand and that an effort has been made to correct this, e.g. correction for underenumeration, correction for migration, correction for seasonal movement. Standardized rates ^{7} or adjusted rates ^{7} are designed to make it possible to compare different populations with respect to a variable, e.g. fertility or mortality, where the influence of another variable, e.g. age, is held constant. The term corrected rate ^{7} has been used by some demographers as a synonym for standardized rate. When the data do not permit direct estimation of the rates (small population, for example), the use of standard rates ^{9★} (cf. 403-6 for example), computed from data of good quality and applied to the real population, provides an indirect estimation of the expected number of events which can be compared with the observed number of events. Unstandardized rates are called crude rates ^{8}. Although they may be used to measure actual trends, false inferences may result from their uncritical use when populations with different structures (144-4) are compared.
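The indirect estimation described above can be sketched as follows. The standard rates, the age groups and the observed number of events are all invented, and comparing observed and expected events by means of a ratio is one possible choice assumed here.

```python
# Hypothetical standard rates (events per person per year) by age group.
standard_rates = {"0-14": 0.001, "15-64": 0.004, "65+": 0.050}

# Hypothetical age structure of the real (small) population.
population = {"0-14": 2000, "15-64": 6000, "65+": 2000}

# Number of events actually observed in that population during the year.
observed_events = 150

# Expected events: the standard rates applied to the real age structure.
expected_events = sum(
    standard_rates[age] * population[age] for age in population
)

# One way of comparing the two numbers (assumed here): their ratio.
observed_to_expected = observed_events / expected_events

print(round(expected_events, 1), round(observed_to_expected, 3))
```

A ratio above one suggests that events are more frequent in the real population than the standard rates would imply for the same age structure.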
137
Demographic indices (132-7) will in most cases relate to a particular period of observation ^{1} ; this is true in particular of most rates (cf. 133-4). An annual rate ^{2} will relate to a period of twelve months. Where observations are collected for a number of years and then averaged, the term mean annual rate ^{3} or average annual rate ^{3} is often used for the result. Where rates are calculated for periods different from a year they are converted to an annual basis ^{4} through multiplication by an appropriate factor. Instantaneous rates ^{5} are sometimes computed; they relate to an infinitesimal period of time, cf. for instance the instantaneous death rate (431-4) or the instantaneous rate of population growth (702-5).
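Conversion to an annual basis and the computation of a mean annual rate may be illustrated by a minimal sketch. The figures are invented, the function name `to_annual_basis` is an assumption, and the mean annual rate is taken here as the simple average of the yearly rates.

```python
def to_annual_basis(rate, period_in_years):
    """Multiply a rate observed over a period by the factor 1 / period length."""
    return rate * (1.0 / period_in_years)


# Hypothetical quarter: 30 events in a population of 10,000, per thousand.
quarterly_rate = 1000 * 30 / 10000
annualized_rate = to_annual_basis(quarterly_rate, 0.25)

# Hypothetical annual rates for three consecutive years, per thousand.
annual_rates = [12.0, 11.0, 13.0]
mean_annual_rate = sum(annual_rates) / len(annual_rates)

print(quarterly_rate, annualized_rate, mean_annual_rate)
```

For a quarter the appropriate factor is four, so a quarterly rate of 3 per thousand corresponds to 12 per thousand on an annual basis.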
138
The primary objective of cohort analysis (103-4) is the study of the intensity ^{1} and tempo ^{2} or timing ^{2} of demographic phenomena. The intensity of a phenomenon initiated by one non-renewable event (201-4) may be measured by either the ultimate frequency ^{3} of occurrence for the given event or by its complement. The ultimate frequency reflects the proportion of persons who would have experienced the event, in the absence of extraneous influences, during the existence of the cohort (116-2). The intensity of a phenomenon initiated by a renewable event (201-5) such as births or migratory moves, can be measured by the mean number of events ^{4} per person in the cohort, also in the absence of extraneous influences. Tempo or timing may be defined as the distribution over time within the cohort of the demographic events corresponding to the investigated phenomenon. The results of cross-sectional analysis or period analysis (103-5) are summarized by period measures ^{5} — as opposed to cohort measures ^{6} — which can be constructed in various fashions. A commonly used technique consists in attributing the observed rates pertaining to various ages or durations to a hypothetical cohort ^{7} or synthetic cohort ^{7} .
- 3. This ultimate frequency or its complement has received various names according to the phenomenon studied: parity progression ratio (637-7), frequency of definitive celibacy (521-1) ... It is best not to use the word proportion as part of these names, and to reserve it for observed proportions. For instance, the frequency of definitive celibacy must be kept distinct from the proportion single at a given age, as recorded in a census.
- 4. It is not unusual to give the same name to the observed mean number of events per person, and to the number that would have been observed in the absence of extraneous influences such as mortality. Distinct phrases should be used; for instance, the number of children ever born (637-2) can be distinguished from cumulative fertility (636-2).
- 5. Because cross-sectional analysis and hypothetical cohorts were used before genuine cohort analysis, the names of period indices often seem to imply that they refer to a cohort. This usage may lead to apparent contradictions. For example, parity-specific birth probabilities may exceed one for certain years when many postponed births are made up.
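The hypothetical cohort technique can be sketched by attributing the rates observed at various ages in one period to a single imaginary cohort and summing them, as in the total fertility rate (133-6). The rates below are invented; with five-year age groups, each annual rate is assumed to apply for five years.

```python
# Hypothetical age-specific fertility rates observed in a single period
# (births per woman per year), by five-year age group.
age_specific_fertility_rates = {
    "15-19": 0.020,
    "20-24": 0.100,
    "25-29": 0.120,
    "30-34": 0.090,
    "35-39": 0.050,
    "40-44": 0.015,
    "45-49": 0.005,
}
age_group_width = 5  # years spent by the hypothetical cohort in each group

# Mean number of births per woman in the hypothetical cohort,
# in the absence of mortality.
total_fertility = age_group_width * sum(age_specific_fertility_rates.values())

print(round(total_fertility, 2))
```

The result is a period measure: no real cohort need ever experience this set of rates over its lifetime.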