POVERTY AND EQUITY:
MEASUREMENT, POLICY AND
ESTIMATION WITH DAD
ECONOMIC STUDIES IN INEQUALITY, SOCIAL EXCLUSION AND WELLBEING
Editor:
Jacques Silber, Bar Ilan University
Editorial Advisory Board:
John Bishop, East Carolina University, USA.
Satya Chakravarty, Indian Statistical Institute, India.
Conchita D'Ambrosio, University of Milano-Bicocca, Italy.
David Gordon, University of Bristol, The United Kingdom.
Jaya Krishnakumar, University of Geneva, Switzerland.
This series will publish volumes that go beyond the traditional concepts of consumption, income or wealth and will offer a broad, inclusive view of inequality and wellbeing. Specific areas of interest will include Capabilities and Inequalities, Discrimination and Segregation in the Labor Market, Equality of Opportunities, Globalization and Inequality, Human Development and the Quality of Life, Income and Social Mobility, Inequality and Development, Inequality and Happiness, Inequality and Malnutrition, Inequality in Consumption and Time Use, Inequalities in Health and Education, Multidimensional Inequality and Poverty Measurement, Polarization among Children and Elderly People, Social Policy and the Welfare State, and Wealth Distribution.
Volume 1
de Janvry, Alain and Kanbur, Ravi
Poverty, Inequality and Development: Essays in Honor of Erik Thorbecke
Volume 2
Duclos, Jean-Yves and Araar, Abdelkrim
Poverty and Equity: Measurement, Policy and Estimation with DAD
Jointly published by
Springer
233 Spring Street
New York, NY 10013
and the
International Development Research Centre
PO Box 8500, Ottawa, ON, Canada K1G 3H9
info@idrc.ca / www.idrc.ca
Library of Congress Control Number: 2006923024
ISBN-10: 0387258930 (HB)  e-ISBN-10: 0387333185
© 2006 Springer Science+Business Media, LLC
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
Printed in the United States of America.
9 8 7 6 5 4 3 2 1
springer.com
To Marie-Chantal,
Étienne, Clémence and
Antoine. Without their
love, I could not be
such a happy Dad.
Yves
To Syham, Abdou and
Aymen.
To my mother and my
father
Araar Abdelkrim
The publication of this book is in large part due to the role of Canada's International Development Research Centre (IDRC) in encouraging policy-relevant research in the fields of poverty and equity. The book has in particular benefitted much from IDRC's support of two significant ventures, the Micro Impacts of Macro Economic and Adjustment Programs (MIMAP) and the Poverty and Economic Policy (PEP) international research network. We are most grateful to the IDRC for their continued and inspiring dedication, professionalism and vision in the field of development research. The Secrétariat d'appui institutionnel à la recherche économique en Afrique (SISERA), the World Bank Institute and the African Economic Research Consortium (AERC) have also partially supported the production of this book. The fundamental research was further financed by grants from the Social Sciences and Humanities Research Council of Canada (SSHRC) and from the Fonds Québécois de Recherche sur la Société et la Culture (FQRSC) of the Province of Québec.
This book is mostly targeted to senior undergraduate and graduate students in economics as well as to researchers and analytical policy makers. More generally, it is intended for social scientists and statisticians. Some of its content can also be instructive to less specialized readers, such as those in the general public wishing to introduce themselves to the challenges posed and the insights generated by distributive analysis.
The book covers a relatively wide range of material. Part I deals with some of the conceptual, methodological and empirical issues and difficulties that arise in the assessment of wellbeing and poverty. Part II presents a number of measures of poverty, inequality, social welfare, and vertical and horizontal equity. Part III considers some of the methods that can establish whether a distribution of wellbeing or a policy "dominates" another in terms of some generally defined ethical criteria. Part IV develops tools that can be used to understand and predict how targeting, price changes, growth and fiscal policy can affect poverty and equity. Part V introduces some of the statistical techniques that can help depict the distribution of living standards and help protect against the presence of sampling errors in making poverty and equity comparisons. Part V also introduces DAD and shows how that software can be used to apply the book's measurement and statistical techniques to micro data. Part VI contains a number of exercises (with illustrative datasets) that can be used to learn to implement some of the measurement and statistical techniques described in the book.
We certainly cannot claim that the book is a comprehensive survey of the methods used to analyze poverty and equity. There is an obvious tendency for an author's exposition of a subject to be biased in favor of the work he knows best, and thus in favor of the work most closely related to his own. This book is a clear example of this bias. One advantage of such a bias, however, is that it tends to unify the exposition. We have tried to enforce such unification as much as we could throughout the various parts of the book. This helped present in a single text a unified treatment of distributive analysis from a conceptual, methodological, policy, statistical and practical point of view.
Most of the book's footnotes refer to applications programmed in DAD. These footnotes can thus guide the reader to where to go in DAD to test and implement many of the measurement and statistical tools exposed in the book. In the margins appear the exercise numbers which can be used to learn more about the book's tools. Most of these exercises involve the use of DAD. The solutions to the exercises can be found on DAD's official web page,
The illustrative datasets used are briefly described at the end of the exercises. An index of the symbols used can be found starting on page 376. An author and a subject index are also provided at the end of the book.
To ease exposition within the main text, we endeavored to limit as much as possible references to the literature, except when such references clearly improved readability. Instead, each chapter is followed by a reference section in which the chapter's appropriate bibliographic references are mentioned and linked to each other.
This book and the accompanying software are certainly perfectible. I suppose it is the plight of all book writers to feel that their product is never satisfactorily finished. We hope to correct some of this version's shortcomings in future editions. For this, any comments on this first edition will be gratefully received.
I wish to thank my coauthors and former students, Sami Bibi, Philippe Grégoire, Vincent Jalbert, Paul Makdissi and Martin Tabi, for their insights and dedication. I am also very grateful to my coauthors on distributive analysis papers (Russell Davidson, Damien Échevin, Carl Fortin, Peter Lambert, Magda Mercader, David Sahn, Steve Younger and Quentin Wodon) for their friendship and fruitful collaboration. Work at Université Laval was made productive and particularly enjoyable by the encouragement of my colleagues (among whom Bernard Decaluwé and Bernard Fortin feature prominently as former heads of CRÉFA) and more generally by the support of the Department of Economics and CIRPÉE (formerly CRÉFA). My thanks also extend to MIMAP and PEP coworkers, inter alia Touhami Abdelkhalek, Louis-Marie Asselin, Dorothée Boccanfuso, John Cockburn, Anyck Dauphin, Yazid Dissou, Samuel Kaboré, Jean Bosco Ki, Marie-Claude Martin, Damien Mededji, Abena Oduro, Luc Savard and Randy Spence, and to the teams of IDRC and AERC administrators and researchers with whom we have had the pleasure and privilege to work in the last decade. They provided much of the motivation and inspiration for writing this book. I am also grateful to my coauthor, Abdelkrim Araar, for the trust and dedication he put into building DAD and this book's material over the last years, despite the uncertainty that initially clouded the project. I finally wish to thank Bill Carman of IDRC and Marilea Polk Fried of Springer-Kluwer for their efforts in bringing the publication of this book to full completion.
JEAN-YVES DUCLOS
Developing the DAD software, conducting fundamental research in distributive analysis and assisting researchers in developed countries have been my main activities for the last several years. The expertise that I have acquired in distributive analysis is the result of the continued support that I have received from Jean-Yves Duclos, who was also the director of my Ph.D. thesis. I am also grateful to all the researchers with whom I have worked for their collaboration and assistance.
ARAAR ABDELKRIM
The assessment of wellbeing for poverty analysis is traditionally characterized according to two main approaches, which, following Ravallion (1994), we will term the welfarist and the nonwelfarist approaches. The first approach tends to concentrate in practice mainly on comparisons of "economic wellbeing", which we will also call "standard of living" or "income" (for short). As we will see, this approach has strong links with traditional economic theory, and it is also widely used by economists in the operations and research work of organizations such as the World Bank, the International Monetary Fund, and Ministries of Finance and Planning of both developed and developing countries. The second approach has historically been advocated mainly by social scientists other than economists and partly in reaction to the first approach. This second approach has nevertheless also been recently and increasingly advocated by economists and noneconomists alike as a multidimensional complement to the unidimensional standard of living approach.
The welfarist approach is strongly anchored in classical microeconomics, where, in the language of economics, "welfare" or "utility" are generally key in accounting for the behavior and the wellbeing of individuals. Classical microeconomics usually postulates that individuals are rational and that they can be presumed to be the best judges of the sort of life and activities which maximize their utility and happiness. Given their initial endowments (including time, land, and physical, financial and human capital), individuals make production and consumption choices using their set of preferences over bundles of consumption and production activities, and taking into account the available production technology and the consumer and producer prices that prevail in the economy.
Under these assumptions and constraints, a process of individual and rational free choice will maximize the individuals' utility; under additional assumptions (including that markets are competitive, that agents have perfect information, and that there are no externalities — assumptions that are thus restrictive), a society of individuals all acting independently under this freedom of choice process will also lead to an outcome known as Pareto efficient, in that no one's utility could be further improved by government intervention without decreasing someone else's utility.
Underlying the welfarist approach to poverty, there is a premise that good note should be taken of the information revealed by individual behavior when it comes to assessing poverty. More precisely, the assessment of someone's wellbeing should be consistent with the ordering of preferences revealed by that person's free choices. For instance, a person could be observed to be poor by the total consumption or income standard of a poverty analyst. That same person could nevertheless be able (i.e., have the working capacity) to be nonpoor. This could be revealed by the observation of a deliberate and free choice on the part of the individual to work and consume little, when the capability to work and consume more nevertheless exists. By choosing to spend little (possibly for the benefit of greater leisure), the person reveals that he is happier than if he worked and spent more. Although he could be considered poor by the standard of a (nonwelfarist) poverty analyst, a comprehensive utility judgement would conclude that this person is not poor. As we will discuss later, this can have important implications for the design and the assessment of public policy.
A pure welfarist approach faces important practical problems. To be operational, pure welfarism requires the observation of sufficiently informative revealed preferences. For instance, for someone to be declared poor or not poor, it is not enough to know that person's current characteristics and income status: it must also be inferred from that person's actions whether he judges his utility status to be above a certain poverty utility level.
A related problem with the pure welfarist approach is the need to assess levels of utility or "psychic happiness". How are we to measure the actual pleasure derived from experiencing economic wellbeing? Moreover, it is highly problematic to attempt to compare that level of utility across individuals. It is well known that such a procedure poses serious ethical difficulties. Preferences are heterogeneous, personal characteristics, needs and enjoyment abilities are diverse, households differ in size and composition, and prices vary across time and space. More generally, because economic wellbeing (in particular, utility) is typically seen as a subjective concept, most economists believe that interpersonal comparisons of economic wellbeing do not make much sense.
Supposing that these criticisms are resolved, the welfarist approach would classify as poor an individual who is materially well-off but not content, and as not poor an individual materially deprived but nevertheless content. It is not clear that we should accept as ethically significant such individual feelings of utilities. Said differently, why should a difficult-to-satisfy rich person be judged less well-off than an easily contented poor person? Or, in the words of Sen (1983), p. 160, why should a "grumbling rich" be judged "poorer" than a "contented peasant"?
Hence, welfarist comparisons of poverty almost invariably use imperfect but objectively observable proxies for utilities, such as income or consumption. The "working" definition of poverty for the welfarist approach is therefore a lack of command over commodities, measured by low income or consumption. These money-metric indicators are often adjusted for differences in needs, prices, and household sizes and compositions, but they clearly represent far-from-perfect indicators of utility and wellbeing. Indeed, economic theory tells us little about how to use consumption or income to make consistent interpersonal comparisons of wellbeing. Besides, the consumption and income proxies are rarely able to take full account of the role that public goods and nonmarket commodities, such as safety, liberty, peace and health, play in wellbeing. In principle, such commodities can be valued using reference or "shadow" prices. In practice, this is difficult to do accurately and consistently.
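As a rough illustration of how a money-metric indicator might be adjusted for prices and household composition, the sketch below deflates household consumption by a price index and divides it by a parametric equivalence scale. The function names, the form of the scale, the parameter values and the poverty line of 500 are all assumptions made for illustration; they are not taken from the text or from DAD.

```python
def equivalent_consumption(total_consumption, price_index, n_adults, n_children,
                           child_weight=0.5, scale_elasticity=0.75):
    """Real consumption per adult equivalent.

    Hypothetical adjustment: deflate by a price index, then divide by
    (n_adults + child_weight * n_children) ** scale_elasticity.
    """
    if price_index <= 0:
        raise ValueError("price_index must be positive")
    if n_adults + n_children <= 0:
        raise ValueError("household must have at least one member")
    real_consumption = total_consumption / price_index
    adult_equivalents = (n_adults + child_weight * n_children) ** scale_elasticity
    return real_consumption / adult_equivalents


def is_poor(indicator, poverty_line):
    """Working welfarist definition: the indicator falls below the poverty line."""
    return indicator < poverty_line


# Two households with the same nominal consumption but different composition.
single = equivalent_consumption(1200.0, price_index=1.2, n_adults=1, n_children=0)
family = equivalent_consumption(1200.0, price_index=1.2, n_adults=2, n_children=2)
print("single adult:", round(single, 1), "poor:", is_poor(single, 500.0))
print("family of four:", round(family, 1), "poor:", is_poor(family, 500.0))
```

With identical nominal consumption, the larger household ends up with a lower adjusted indicator. How much lower depends entirely on the assumed scale, which is one reason such proxies remain imperfect.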
There are two major nonwelfarist approaches, the basic needs approach and the capability approach. The first focuses on the need to attain some basic multidimensional outcomes that can be observed and monitored relatively easily. These outcomes are usually (explicitly or implicitly) linked with the concept of functionings, a concept largely developed in Amartya Sen's influential work:
Living may be seen as consisting of a set of interrelated 'functionings', consisting of beings and doings. A person's achievement in this respect can be seen as the vector of his or her functionings. The relevant functionings can vary from such elementary things as being adequately nourished, being in good health, avoiding escapable morbidity and premature mortality, etc., to more complex achievements such as being happy, having self-respect, taking part in the life of the community, and so on (Sen 1992, p. 39).
In this view, functionings can be understood to be constitutive elements of wellbeing. One lives well if he enjoys a sufficiently large level of functionings. The functioning approach would generally not attempt to compress these multidimensional elements into a single dimension such as utility or happiness. Utility or happiness is viewed as a reductive aggregate of functionings, which are multidimensional in nature. The functioning approach usually focuses instead on the attainment of multiple specific and separate outcomes, such as the enjoyment of a particular type of commodity consumption, being healthy, literate, well-clothed, well-housed, socially empowered, etc.
The functioning approach is closely linked to the well-known basic needs approach, and the two are often difficult to distinguish in practice. Functionings, however, are not synonymous with basic needs. Basic needs can be understood as the physical inputs that are usually required for individuals to achieve functionings. Hence, basic needs are usually defined in terms of means rather than outcomes, for instance, as living in the proximity of providers of health care services (but not necessarily being in good health), as the number of years of achieved schooling (but not necessarily being literate), as living in a democracy (but not necessarily participating in the life of the community), and so on. In other words,
Basic needs may be interpreted in terms of minimum specified quantities of such things as food, shelter, water and sanitation that are necessary to prevent ill health, undernourishment and the like (Streeten, Burki, Ul Haq, Hicks, and Stewart 1981).
Unlike functionings, which can be commonly defined for all individuals, the specification of basic needs depends on the characteristics of individuals and of the societies in which they live. For instance, the basic commodities required for someone to be in good health and not to be undernourished will depend on the climate and on the physiological characteristics of individuals. Similarly, the clothes necessary for one not to feel ashamed will depend on the norms of the society in which he lives, and the means necessary to travel, on whether he is handicapped or not. Hence, although the fulfillment of basic needs is an important element in assessing whether someone has achieved some functionings, this assessment must also use information on one's characteristics and socioeconomic environment. Human diversity is such that equality in the space of basic needs generally translates into inequality in the space of functionings.
Whether unidimensional or multidimensional in nature, most applications of both the welfarist and the nonwelfarist approaches to poverty measurement do recognize the role of heterogeneity in characteristics and in socioeconomic environments in achieving wellbeing. Streeten, Burki, Ul Haq, Hicks, and Stewart (1981) and others have nevertheless argued that the basic needs approach is less abstract than the welfarist approach in recognizing that role. Indeed, as mentioned above, assessing the fulfillment of basic needs can be seen as a useful practical and operational step towards appraising the achievement of the more abstract "functionings".
Clearly, however, there are important degrees in the multidimensional achievements of basic needs and functionings. For instance, what does it mean precisely to be "adequately nourished"? Which degree of nutritional adequacy is relevant for poverty assessment? Should the means needed for adequate nutritional functioning only allow for the simplest possible diet and for highest nutritional efficiency? These problems also crop up in the estimation of poverty lines in the welfarist approach. A multidimensional approach extends them to several dimensions.
In addition, how ought we to understand such functionings as the functioning of self-respect? The appropriate width and depth of the concept of basic needs and functionings is admittedly ambiguous, as there are degrees of functionings which make life enjoyable in addition to making it purely sustainable or satisfactory. Furthermore, could some of the dimensions be substitutes in the attainment of a given degree of wellbeing? That is, could it be that one could do with lower needs and functionings in some dimensions if he has high achievements in the other dimensions? Such possibilities of substitutability are generally ignored (and are indeed hard to specify precisely) in the multidimensional nonwelfarist approaches.
A second alternative to the welfarist approach is called the capability approach, also pioneered and advocated in the last three decades by the work of Sen. The capability approach is defined by the capacity to achieve functionings, as defined above. In the words of Sen (1992),
the capability to function represents the various combinations of functionings (beings and doings) that the person can achieve. Capability is, thus, a set of vectors of functionings, reflecting the person's freedom to lead one type of life or another (p. 40).
What matters for the capability approach is the ability of an individual to function well in society; it is not the functionings actually achieved by the person per se. Having the capability to achieve "basic" functionings is the source of freedom to live well, and is thereby sufficient in the capability approach for one not to be poor or deprived.
The capability approach thus distances itself from achievements of specific outcomes or functionings. In this, it imparts considerable value to freedom of choice: a person will not be judged poor even if he chooses not to achieve some functionings, so long as he would be able to achieve them if he so chose. This distinction between outcomes and the capability to achieve these outcomes also recognizes the importance of preference diversity and individuality in determining functioning choices. It is, for instance, not everyone's wish to be well-clothed or to participate in society, even if the capability is present.
An interesting example of the distinction between fulfillment of basic needs, functioning achievement and capability is given by the deprivation index of Townsend (1979, Table 6.3). This deprivation index is built from answers to questions such as whether someone "has not had an afternoon or evening out for entertainment in the last two weeks", or "has not had a cooked breakfast most days of the week". It may be, however, that one chooses deliberately not to have time out for entertainment (he prefers to watch television), or that he chooses not to have a cooked breakfast (he does not want to spend the time to prepare it), although he does have the capacity to have both. That person therefore achieves the functioning of being entertained without meeting the basic need of going out once a fortnight, and he does have the capacity to achieve the functioning of having a cooked breakfast, although he chooses not to have one.
The difference between the capability and the functioning/basic needs approaches is in fact somewhat analogous to the difference between the use of income and consumption as indicators of living standards. Income shows the capability to consume, and "consumption functioning" can be understood as the outcome of the exercise of that capability. There is consumption only if a person chooses to enact his capacity to consume a given income. In the basic needs and functioning approach, deprivation comes from a lack of direct consumption or functioning experience; in the capability approach, poverty arises from the lack of incomes and capabilities, which are imperfectly related to the functionings actually achieved.
Although the capability set is multidimensional, it thus exhibits a parallel with the unidimensional income indicator, whose size determines the size of the "budget set":
Just as the so-called 'budget set' in the commodity space represents a person's freedom to buy commodity bundles, the 'capability set' in the functioning space reflects the person's freedom to choose from possible livings (Sen 1992, p. 40).
This shows further the fundamental distinction between the extents of freedoms and capabilities, the space of achievements, and the resources required to generate these freedoms and to attain these achievements.
To illustrate the relationships between the main approaches to assessing poverty, consider Figure 1.1. Figure 1.1 shows in four quadrants the links between income, the consumption of two goods (transportation T and clothing C) and the functionings associated with the consumption of each of these two goods. The northeast quadrant shows a typical budget set for the two goods and for a budget constraint Y1. The curve U1 shows the utility indifference curve along which the consumer chooses his preferred commodity bundle, which is here located at point A.
The northwestern and the southeastern quadrants then transform the consumption of goods T and C into associated functionings FT and FC. This is done through the functioning Transformation Curves TCT and TCC, which transform the consumption of T and C into transportation and clothing functionings, respectively. The curves TCT and TCC appear in the northwest and the southeast quadrants, respectively. These curves thus bring us from the northeastern space of commodities, {C, T}, into the southwestern space of functionings, {FC, FT}. Using these transformation functions, we can draw a budget constraint S1 in the space of functionings using the traditional commodity budget constraint, Y1. Since the consumer chooses point A in the space of commodities, he enjoys B's combination of functionings. But all of the functionings within the constraint S1 can also be attained by the consumer. The triangular area between the origin and the line S1 thus represents the individual's capability set. It is the set of functionings which he is able to achieve.
Now assume that functioning thresholds of zC and zT must be exceeded (or must be potentially exceeded) for one not to be considered poor by nonwelfarist analysts. Given the transformation functions TCT and TCC, a budget constraint Y1 makes the individual capable of not being poor in the functioning space. But this does not guarantee that the individual will choose a combination of functionings that will exceed zC and zT: this also depends on the individual's preferences. At point A, the functionings achieved are above the minimum functioning thresholds fixed in each dimension. Other points within the capability set would also surpass the functioning thresholds: these points are shown in the shaded triangle to the northeast of point B. Since part of the capability set allows the individual to be nonpoor in the space of functionings, the capability approach would also declare the individual not to be poor.
So, too, would the functioning approach conclude, since the individual chooses functionings above zC and zT. Such a concordance between the two approaches does not always prevail, however. To see this, consider Figure 1.2. The commodity budget set and the functioning Transformation Curves have not changed, so that the capability set has not changed either. But there has been a shift of preferences from U1 to U2, so that the individual now prefers point D to point A, and also prefers to consume less clothing than before. This places his chosen functionings at point E, which fails to exceed the minimum clothing functioning zC required. Hence, the person would be considered nonpoor by the capability approach, but poor by the functioning approach. Whether an individual with preferences U2 is really poorer than one with preferences U1 is debatable, of course, since the two have exactly the same "opportunity sets", that is, have access to exactly the same commodity and capability sets.
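The contrast between Figures 1.1 and 1.2 can be mimicked numerically. The sketch below is a stylized version only: it assumes linear transformation curves (functioning = slope × consumption) and Cobb-Douglas preferences (a fixed budget share spent on transportation), neither of which is specified in the text. All names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Person:
    budget: float              # commodity budget Y
    share_transport: float     # assumed Cobb-Douglas budget share spent on T
    a_transport: float = 1.0   # assumed linear slope of the transformation curve for T
    a_clothing: float = 1.0    # assumed linear slope of the transformation curve for C


def chosen_functionings(p, price_t=1.0, price_c=1.0):
    """Functionings (F_T, F_C) generated by the preferred commodity bundle."""
    t = p.share_transport * p.budget / price_t
    c = (1.0 - p.share_transport) * p.budget / price_c
    return p.a_transport * t, p.a_clothing * c


def capability_poor(p, z_t, z_c, price_t=1.0, price_c=1.0):
    """Poor if no affordable bundle reaches both thresholds at once."""
    cost = price_t * z_t / p.a_transport + price_c * z_c / p.a_clothing
    return cost > p.budget


def functioning_poor(p, z_t, z_c, price_t=1.0, price_c=1.0):
    """Poor if the functionings actually chosen miss either threshold."""
    f_t, f_c = chosen_functionings(p, price_t, price_c)
    return f_t < z_t or f_c < z_c


z_t, z_c = 30.0, 30.0
u1 = Person(budget=100.0, share_transport=0.5)  # balanced preferences, in the spirit of U1
u2 = Person(budget=100.0, share_transport=0.8)  # little taste for clothing, in the spirit of U2
for label, person in (("U1", u1), ("U2", u2)):
    print(label,
          "capability poor:", capability_poor(person, z_t, z_c),
          "functioning poor:", functioning_poor(person, z_t, z_c))
```

Both persons face the same budget and the same capability set, so the capability test gives the same answer for both. Only the functioning test depends on the preference parameter.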
An important allowance in the capability approach is that two persons with the same commodity budget set can face different capability sets. This is illustrated in Figure 1.3, where the functioning Transformation Curve for transportation has shifted away from TCT. This may be due to the presence of a handicap, which makes it more costly in transportation expenses to generate a given level of transportation functioning (disabled persons would need to spend more to go from one place to another). This shift of the TCT curve moves the capability constraint to S1' and thus contracts the capability set.
With the handicap, there is no point within the new capability set that would surpass both functioning thresholds zC and zT. Hence, the person is deemed poor by the capability approach and (necessarily so) by the functioning approach. Whether the welfarist approach would also declare the person to be poor would depend on whether it takes into account the differences in needs implied by the difference between the original TCT curve and the shifted one.
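A small numerical sketch of the handicap case follows. It assumes linear transformation curves, so that a handicap simply lowers the slope at which transportation spending is turned into transportation functioning. The slope values, prices, budget and thresholds are hypothetical.

```python
def capability_frontier(budget, price_t, price_c, a_t, a_c, steps=100):
    """Points (F_T, F_C) on the capability constraint, assuming linear transformation curves."""
    points = []
    for i in range(steps + 1):
        spend_t = budget * i / steps  # part of the budget spent on transportation
        f_t = a_t * spend_t / price_t
        f_c = a_c * (budget - spend_t) / price_c
        points.append((f_t, f_c))
    return points


def capability_poor(budget, price_t, price_c, a_t, a_c, z_t, z_c):
    """Poor if the cheapest bundle reaching both thresholds costs more than the budget."""
    return price_t * z_t / a_t + price_c * z_c / a_c > budget


budget, z_t, z_c = 100.0, 30.0, 30.0
for label, a_t in (("no handicap", 1.0), ("handicap", 0.4)):
    frontier = capability_frontier(budget, 1.0, 1.0, a_t, 1.0)
    reachable = any(f_t >= z_t and f_c >= z_c for f_t, f_c in frontier)
    print(label,
          "| some frontier point meets both thresholds:", reachable,
          "| capability poor:", capability_poor(budget, 1.0, 1.0, a_t, 1.0, z_t, z_c))
```

The commodity budget is identical in the two cases. Only the transformation slope differs, which is enough to move the person from nonpoor to poor under the capability test.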
For the welfarist approach to be reasonably consistent with the functioning and capability approaches, it is thus essential to consider the role of transformation functions such as the TC curves. If this is done, we may (in our simple illustration at least) assess a person's capability status either in the commodity or in the functioning space.
To see this, consider Figure 1.4. Figure 1.4 is the same as Figure 1.1 except for the addition of the commodity budget constraint Y2 which shows the minimum consumption level needed for one not to be poor according to the capability approach. According to the capability approach, the capability set must contain at least one combination of functionings above zC and zT, and this condition is just met by the capability constraint S2 that is associated with the commodity budget Y2. Hence, to know whether someone is poor according to the capability approach, we may simply check whether his commodity budget constraint lies below Y2.
Even if the actual commodity budget constraint lies above Y2, the individual may well choose a point outside the nonpoor functioning set, as we discussed above in the context of Figure 1.2. Clearly then, the minimum total consumption needed for one to be nonpoor according to the functioning or basic needs approach generally exceeds the minimum total consumption needed for one to be nonpoor according to the capability approach. More problematically, this minimum total consumption depends in principle on the preferences of the individuals. On Figure 1.2, for instance, we saw that the individual with preference U2 was considered poor by the functioning approach, although another individual with the same budget and capability sets but with preferences U1 was considered nonpoor by the same approach.
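The two minimum budgets can be compared in a stylized setting. The sketch below again assumes linear transformation curves and Cobb-Douglas preferences with a fixed budget share on transportation; these functional forms, the function names and the numbers are illustrative assumptions, not part of the text.

```python
def min_budget_capability(z_t, z_c, price_t, price_c, a_t=1.0, a_c=1.0):
    """Budget Y2: cost of the cheapest bundle whose functionings just reach z_T and z_C."""
    return price_t * z_t / a_t + price_c * z_c / a_c


def min_budget_functioning(z_t, z_c, price_t, price_c, share_transport, a_t=1.0, a_c=1.0):
    """Smallest budget at which the chosen bundle reaches both thresholds.

    Assumes Cobb-Douglas preferences: a fixed share of the budget goes to T.
    """
    if not 0.0 < share_transport < 1.0:
        raise ValueError("share_transport must lie strictly between 0 and 1")
    need_t = price_t * z_t / (a_t * share_transport)
    need_c = price_c * z_c / (a_c * (1.0 - share_transport))
    return max(need_t, need_c)


y2 = min_budget_capability(30.0, 30.0, 1.0, 1.0)
print("capability approach, Y2:", y2)
for share in (0.3, 0.5, 0.8):
    y_f = min_budget_functioning(30.0, 30.0, 1.0, 1.0, share)
    print("functioning approach, share on T =", share, "->", round(y_f, 2))
```

Under these assumptions the functioning-based minimum never falls below Y2. It equals Y2 only when the budget shares happen to match the cost shares of the two thresholds, and it moves with the preference parameter, as the paragraph above notes.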
1 Show on a figure such as Figure 1.1 the impact of an increase in the price of the transportation commodity on the commodity budget constraint and on the capability constraint.
2 On a figure such as Figure 1.4, show the minimal commodity budget set that ensures that the person
(a) is just able to attain one of the two minimum levels of functionings zC or zT;
(b) chooses a combination of functionings such that one of them exceeds the corresponding minimum level of functionings zC or zT;
(c) is just able to attain both minimum levels of functionings zC and zT;
(d) chooses a combination of functionings such that both exceed the corresponding minimum level of functionings zC and zT.
(e) How do these four minimal commodity budget constraints compare to each other? Which one corresponds to the different approaches to assessing poverty seen above?
The measurement of capabilities raises various problems. Unless a person chooses to enact them in the form of functioning achievements, capabilities are not easily inferred. Achievement of all basic functionings implies nondeprivation in the space of all capabilities; but a failure to achieve all basic functionings does not imply capability deprivation. This makes the monitoring of functioning and basic needs an imperfect tool for the assessment of capability deprivation.
Besides, and as for basic needs, there are clearly degrees of capabilities, some basic and some deeper. It would seem improbable that true wellbeing be a discontinuous function of achievements and capabilities. For most of the functionings assessed empirically, there are indeed degrees of achievements, such as for being healthy, literate, living without shame, etc. It would seem important to think of varying degrees of wellbeing in assessing and comparing achievements and capabilities, and not only to record dichotomic 0/1 answers to multidimensional qualitative criteria.
The multidimensional nature of the nonwelfarist approaches also raises problems of comparability across dimensions. How should we assess adequately the wellbeing of someone who has the capability to achieve two functionings out of three, but not the third? Is that person necessarily "better off" than someone who can achieve only one, or even none of them? Are all capabilities of equal importance when we assess wellbeing?
The multidimensionality of the nonwelfarist criteria also translates into greater implementation difficulties than for the usual proxy indicators of the welfarist approach. In the welfarist approach, the size of the multidimensional budget set is ordinarily summarized by income or total consumption, which can be thought of as a unidimensional indicator of freedom. Although there are many different combinations of consumption and functionings that are compatible with a unidimensional moneymetric poverty threshold, the welfarist approach will generally not impose multidimensional thresholds. For instance, the welfarist approach will usually not require for one not to be poor that both food and nonfood expenditures be larger than their respective food and nonfood poverty lines. A similar transformation into a unidimensional indicator is more difficult with the capability and basic needs approaches.
One possible solution to this comparability problem is to use "efficiencyincome units reflecting command over capabilities rather than command over goods and services" (Sen 1985, p.343), as we illustrated above when discussing Figure 1.4. This, however, is practically difficult to do, since command over many capabilities is hard to translate in terms of a single indicator, and since the "budget units" are hardly comparable across functionings such as wellnourishment, literacy, feeling selfrespect, and taking part in the life of the community. On Figure 1.4, anyone with an income below Y2 would be judged capabilitypoor. But by how much does poverty vary among these capabilitypoor? A natural measure would be a function of the budget constraint. It is more difficult to make such measurements and comparisons within the nonmoneymetric capability set.
The measurement of wellbeing and poverty plays a central role in the discussion of public policy. It is used, among other things, to identify the poor and the nonpoor, to design optimal poverty targeting schemes, to estimate the errors of exclusion and inclusion in the targeting set (also known as Type I and Type II errors), and to assess the equity of poverty alleviation policies. Is growth "propoor"? How do indirect taxes and relative price changes affect the poor? What should the target groups be for sociallyimproving government interventions? What impact do transfers have on poverty? Is it the poorest of the poor who benefit most from public policy?
An important example of the central role of poverty measurement in the setting of public policy is the optimal selection of safety net targeting indicators. The theory of optimal targeting suggests that it will commonly be best to target individuals on the basis of indicators that are as easily observable and as exogenous as possible, while being as correlated as possible with the true poverty status of the individuals. Indicators that are not readily observable by program administrators are of little practical value. Indicators that can be changed effortlessly by individuals will be distorted by the presence of the program, and will lose their povertyinformative value. Whether easily observable and sufficiently exogenous indicators are sufficiently correlated with the deprivation of individuals in a population is given by a poverty profile. The value of this profile will naturally be highly dependent on the approach used and the particular assumptions made to measure wellbeing and poverty.
Estimation of inclusion and exclusion errors is also a product of poverty profiling and measurement. These errors are central in the tradeoff involved in choosing between a wide coverage of the population — at relatively low administrative and efficiency costs — and a narrower coverage — with more generous support for the fewer beneficiaries. Indeed, as van de Walle (1998a) puts it, a narrower coverage of the population, with presumably smaller errors of inclusion of the nonpoor, does not inevitably lead to a more equitable treatment of the poor:
Concentrating solely on errors of leakage to the nonpoor can lead to policies which have weak coverage of the poor (p.366).
The terms of this tradeoff are again given by a poverty assessment exercise.
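The tradeoff between wide and narrow coverage can be illustrated with a small numerical sketch. The definitions used here are assumptions made for the illustration, since conventions vary across studies: the exclusion error is taken to be the share of the poor who are not reached by the program, and the inclusion error (leakage) is taken to be the share of beneficiaries who are not poor. The ten-person population and the two targeting rules are hypothetical.

```python
def targeting_errors(is_poor, is_targeted):
    """Return (exclusion_error, inclusion_error) for one targeting rule.

    Assumed definitions (conventions differ across studies):
      exclusion error = share of the poor who are NOT targeted
      inclusion error = share of the targeted who are NOT poor (leakage)
    The sketch assumes at least one poor and one targeted individual.
    """
    pairs = list(zip(is_poor, is_targeted))
    n_poor = sum(1 for poor, _ in pairs if poor)
    n_targeted = sum(1 for _, targeted in pairs if targeted)
    missed = sum(1 for poor, targeted in pairs if poor and not targeted)
    leaked = sum(1 for poor, targeted in pairs if targeted and not poor)
    return missed / n_poor, leaked / n_targeted


# Hypothetical population: the first four individuals are truly poor.
is_poor = [True] * 4 + [False] * 6

# Wide coverage: all of the poor are reached, along with three nonpoor.
wide = [True] * 7 + [False] * 3
# Narrow coverage: only two of the poor are reached, along with one nonpoor.
narrow = [True, True, False, False, True] + [False] * 5

wide_errors = targeting_errors(is_poor, wide)
narrow_errors = targeting_errors(is_poor, narrow)
print("wide   (exclusion, inclusion):", wide_errors)
print("narrow (exclusion, inclusion):", narrow_errors)
```

In this hypothetical case the narrow rule has a lower leakage to the nonpoor but misses half of the poor, which is the type of situation the quotation above warns against.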
Another lesson of optimal redistribution theory is that it is usually better to transfer resources from groups with a high level of average wellbeing to those with a lower one. What matters more, however, is the distribution of wellbeing within each of the groups. For instance, equalizing mean wellbeing across groups does not ordinarily eliminate poverty since there generally exist withingroup inequalities. Even within the richer group, for instance, there normally will be found some deprived individuals, whom a richtopoor crossgroup redistributive process would clearly not take out of poverty. The within and betweengroup distribution of wellbeing that is required for devising an optimal redistributive scheme can again be revealed by a comprehensive poverty profile.
The distinction between the welfarist and nonwelfarist approaches to poverty measurement often matters (implicitly or explicitly) for the assessment and the design of public policy. As described above, a welfarist approach holds that individuals are the best judges of their own wellbeing. It would thus in principle avoid making appraisals of wellbeing that conflict with the poor's views of their own situation. A typical example of a welfarist public policy would be the provision of adequate incomegenerating opportunities, letting individuals decide and reveal whether these opportunities are utility maximizing, keeping in mind the other nonincomegenerating opportunities that are available to them.
A nonwelfarist policy analyst would argue, however, that providing income opportunities is not necessarily the best policy option. This is partly because individuals are not necessarily best left to their own resolutions, at least in an intertemporal setting, regarding educational and environmental choices for example. The poor's shortrun preoccupations may, for instance, harm their longterm selfinterest. Individuals may choose not to attend skillenhancing programs because they deceivingly appear overly time costly in the shortrun, and because they are not sufficiently aware or convinced of their longterm benefits. Besides, if left to themselves, the poor will not necessarily spend their income increase on functionings that basicneeds analysts would normally consider a priority, such as good nutrition and health.
Thus, "basic needs" cannot be fulfilled only by the generation of private income, but may require significant amounts of targeted and inkind public expenditures on areas such as education, public health and the environment. This would be so even (and especially) if the poor did not presently believe that these areas were deserving of public expenditures. Furthermore, social cohesion concerns are arguably not well addressed by the maximization of private utility, and raising income opportunities will not fundamentally solve problems caused by adverse intrahousehold distributions of wellbeing, for instance.
An objection to the basic needs approach is that it is clearly paternalistic since it supposes that it must be in the absolute interests of all to meet a set of often arbitrarily specified needs. Indeed, as emphasized above, nonwelfarist approaches generally use criteria for identifying and helping the poor that may conflict both with the poor's preferences and with their utility maximizing choices. The welfarist school conversely emphasizes that individuals are generally better placed to judge what is good for them. For instance,
To conclude that a person was not capable of living a long life we must know more than just how long she lived: perhaps she preferred a short but merry life. (Lipton and Ravallion 1995)
To force that person to live a long but boring life might thus go against her preferences.
For poverty alleviation purposes, the prescriptions of nonwelfarist approaches could in principle go as far as, for instance, enforced enrolment in community development programs, forced migration, or forced family planning. This may not only conflict with the preferences of the poor, but would also clearly undermine their freedom to choose. Freedom to choose is, however, arguably one of the most important basic capabilities that contribute fundamentally to wellbeing.
A further example of the possible tension between the welfarist and nonwelfarist influences on public policy comes from optimal taxation theory, which is linked to the theory of optimal poverty alleviation. In the tradition of classical microeconomics, which values leisure in the production and labor market decisions of individuals, pure welfarists would incorporate the utility of leisure in the overall utility function of workers, poor and nonpoor alike. In its support to the poor, the government would then take care to minimize the distortion of their labor/leisure choices so as not to create overly high "deadweight losses". Classical optimal taxation theory then shows that being concerned with such things as labor/leisure distortions implies generally lower benefit reduction rates on the income of the poor than otherwise. Taking into account such abstract things as "deadweight losses" is, however, less typical of the basic needs and functioning approaches. Such approaches would, therefore, usually target program benefits more sharply on the poor, and would exact steeper benefit reduction rates as income or wellbeing increases.
Relative to the pure welfarist approach, nonwelfarist approaches are also typically less reluctant to impose utilitydecreasing (or "workfare") costs as side effects of participation in poverty alleviation schemes. These side effects are in fact often observed in practice. For instance, it is wellknown that income support programs frequently impose participation costs on benefit claimants. These are typically nonmonetary costs. Such costs can be both physical and psychological: providing manual labor, spending time away from home, sacrificing leisure and home production, finding information about application and eligibility conditions, corresponding and dealing with the benefit agency, queuing, keeping appointments, complying with application conditions, revealing personal information, feeling "stigma" or a sense of guilt, etc.
Although nonmonetary, these costs impact on participants' net utility from participating in the programs. When they are negatively correlated with unobserved (or difficult to observe) entitlement indicators, they can provide selfselection mechanisms that enhance the efficiency of poverty alleviation programs, for welfarists and nonwelfarists alike. One unfortunate effect of these costs is, however, that many trulyentitled and truly deserving individuals may shy away from the programs because of the costs they impose. Although program participation could raise their income and consumption above a moneymetric poverty line, some individuals will prefer not to participate, revealing that they find the utility of apparent poverty greater than that of program participation. Welfarists would in principle take these costs into account when assessing the merits of the programs. Nonwelfarists would usually not do so, and would therefore judge such programs more favorably.
Finally, the width of the definition of functionings is also important for the design and the assessment of public policy. For instance, public spending on education is often promoted on the basis of its impact on future productivity and growth. But education can also be seen as a means to attain the functioning of literacy and participation in the community. This then provides an additional support for public expenditures on education. Analogous arguments also apply, for instance, to public expenditures on health, transportation, and the environment.
The empirical assessment of poverty and equity is customarily carried out using data on households and individuals. These data can be administrative (i.e., stored in government files and records), they can come from censuses of the entire population, or (most commonly) they can be generated by probabilistic surveys on the sociodemographic characteristics and living conditions of a population of households or individuals. We focus on this latter case in this chapter.
There are several aspects of the surveying process that are important for assessment of poverty and equity. First, there is the coverage of the survey: does it contain representative information on the entire population of interest, or just on some socioeconomic subgroups? Whether the representativeness of the data is appropriate depends on the focus of the assessment. A survey containing observations drawn exclusively from the cities of a particular country may be perfectly fine if the aim is to design poverty alleviation schemes within these cities; its representativeness will, however, be clearly insufficient if the objective is to investigate the optimal allocation of resources between the country's urban and rural areas.
Then, there is the sample frame of the survey. Surveys are usually stratified and multistaged, and are therefore made of stratified and clustered observations. Stratification ensures that a certain minimum amount of information is obtained from each of a given number of "areas" within a population of interest. Population strata are often geographically defined and typically represent different regions or provinces of a country. Clustering facilitates the interviewing process by concentrating sample observations within particular population subgroups or geographic locations. It thus makes it more cost effective to collect more observations. Strata are often divided into a number of different levels of clusters, representing, say, cities, districts, neighborhoods, and households. A complete listing of firstlevel clusters in each stratum is used to select randomly within each stratum a given number of clusters. The initial clusters can then be subjected to further stratification or clustering, and the process continues until the last sampling units (usually households or individuals) have been selected and interviewed. This therefore leads to both stratification and multistage sampling.
Fundamental in the use of survey data is the role of the randomness of the information that is generated by the sampling process. Because households and individuals are not all systematically interviewed (unlike in the case of censuses), the information generated from survey data will depend on the particular selection of households and individuals that is made from a population. In other words, a poverty/equity assessment of a population will vary according to the sample drawn from that population. For that reason, distributive assessments carried out using survey data will be subject to socalled "sampling errors", that is, to sampling variability. When carrying out distributive analysis using sample data, it is therefore important to recognise and assess the importance of sampling variability.
By ensuring that a minimum amount of information (typically, a minimum number of observations) is obtained from each of a number of strata, stratification decreases the extent of sampling errors. A similar effect is obtained by increasing the total size of the sample: the greater the number of households surveyed, the greater on average is the sampling precision of the estimates obtained. Conversely, by bundling observations around common geographic or socioeconomic indicators, clustering tends to reduce the informative content of the observations drawn and thus also tends to increase the size of the sampling errors (for a given number of observations). The sampling structure of a survey also impacts on its ability to provide accurate information on certain population subgroups. For instance, if the clusters within a stratum represent geographical districts, and betweendistrict variability is large, it would be unwise to use the information generated by the selected districts to depict poverty in the other, nonselected, districts.
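The effect of clustering on sampling variability can be seen in a small simulation. The sketch below is purely illustrative: the population (200 clusters of 50 households, lognormal incomes with a large between-cluster component), the poverty line of 1.0 and the number of replications are all hypothetical choices, not values taken from any survey. It compares the sampling standard deviation of the poverty headcount under simple random sampling and under one-stage cluster sampling with the same number of observations.

```python
import math
import random
import statistics


def make_population(n_clusters=200, cluster_size=50, seed=1):
    """Hypothetical population: incomes share a common cluster-level shock."""
    rng = random.Random(seed)
    population = []
    for _ in range(n_clusters):
        cluster_effect = rng.gauss(0.0, 0.8)  # between-cluster variability
        cluster = [math.exp(cluster_effect + rng.gauss(0.0, 0.3))
                   for _ in range(cluster_size)]
        population.append(cluster)
    return population


def headcount(incomes, z):
    """Share of incomes strictly below the poverty line z."""
    return sum(1 for y in incomes if y < z) / len(incomes)


def simulate(population, z=1.0, clusters_drawn=10, reps=300, seed=2):
    """Std. dev. of the headcount estimate across repeated samples."""
    rng = random.Random(seed)
    everyone = [y for cluster in population for y in cluster]
    n = clusters_drawn * len(population[0])  # same sample size in both designs
    srs_estimates, cluster_estimates = [], []
    for _ in range(reps):
        # Simple random sample of n households.
        srs_estimates.append(headcount(rng.sample(everyone, n), z))
        # Cluster sample: draw whole clusters, interview every household.
        drawn = rng.sample(population, clusters_drawn)
        cluster_estimates.append(headcount([y for c in drawn for y in c], z))
    return statistics.pstdev(srs_estimates), statistics.pstdev(cluster_estimates)


sd_srs, sd_cluster = simulate(make_population())
print("sampling s.d., simple random sample:", round(sd_srs, 4))
print("sampling s.d., cluster sample:      ", round(sd_cluster, 4))
```

Because households in the same cluster resemble each other in this artificial population, the clustered design yields a noticeably larger sampling standard deviation for the same number of interviews.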
Survey data are also fraught with measurement and other "nonsampling" errors. For instance, even though they may have been selected to belong to a sample, some households may end up not being interviewed, either because they cannot be reached or because they refuse to be interviewed. Such "nonresponse" will raise difficulties for distributive assessments if it is correlated with observable and nonobservable household characteristics. Even if interviewed, households will sometimes misreport their characteristics and living conditions, either because of ignorance, misunderstanding or mischief. This tends to make distributive assessments built from survey data diverge systematically from the true (and unobserved) population distributive assessment that would be carried out if there were no nonsampling errors. Clearly, such a shortcoming can bias the understanding of poverty and equity and the subsequent design of public policy.
The empirical analysis of vulnerability and poverty dynamics is particularly "data demanding". In general, it requires longitudinal (or panel) surveys, surveys that follow each other in time and that interview the same final observational units. Because they link the same units across time, longitudinal data contain more information than transversal (or crosssectional) surveys, and they are particularly useful for measuring vulnerability and for understanding poverty dynamics — in addition to facilitating the assessment of the temporal effects of public policy on wellbeing. Note, however, that measurement errors are particularly problematic for the analysis of vulnerability and mobility.
It is frequently argued that consumption is better suited than income as an indicator of living standards, at least in many developing countries. One reason is that consumption is believed to vary more smoothly than income, both within a given year and across the life cycle. Income is notoriously subject to seasonal variability, particularly in developing countries, whereas consumption tends to be less variable. Lifecycle theories also predict that individuals will try to smooth their consumption across their low and highincome years (in order to equalize their "marginal utility of consumption" across time), through appropriate borrowing and saving behavior. In practice, however, consumption smoothing is far from perfect, in part due to imperfect access to commodity and credit markets and to difficulties in estimating precisely one's "permanent" or lifecycle income. Using shortterm vs longerterm consumption or income indicators can therefore change the assessment of wellbeing.
For the nonwelfarist interested in outcomes and functionings, consumption is also preferred over income because it is deemed to be a more "direct" indicator of achievements and fulfilment of basic needs. A caveat is, however, that consumption is an outcome of individual free choice, an outcome which may differ across individuals of the same income and ability to consume, just like actual functionings vary across people of the same capability sets. At a given capability to spend, some individuals may choose to consume less (or little), preferring instead to give to charity, to vow poverty, or to save in order to leave important bequests to their children.
Consumption is also held to be more readily observed, recalled and measured than income (at least in developing countries, although even then this is not always the case), and to suffer less from underreporting problems. This is not to say that consumption is easy to measure accurately. Sources of income are typically far more limited than types of expenditures, which can make it easier to collect income information. The periodicity of expenses on various goods varies, and different recall periods are therefore needed to ensure adequate expenditure coverage.
Moreover, consumption does not equal expenditures. The value of consumption equals the sum of the expenditures on the goods and services purchased and consumed in a given period, plus the value of goods and services consumed but not purchased (such as those received as gifts and produced by the household itself), plus the consumption or service value of assets and durable goods owned. Unlike expenditures, therefore, consumption includes the value of ownproduced goods. The value of these goods is not easily assessed, since it has not been transacted in a market. Distinguishing consumption from investment is also very difficult, but failure to do so properly can lead to doublecounting in the consumption measure. For instance, a $1 expenditure on education or machinery should not be counted as current consumption if the returns and the utility of such expenditure will only accrue later in the form of higher future utility and earnings.
Similarly, and as just mentioned, the value of the services provided by those durable goods owned by individuals ought also to enter into a complete consumption indicator, but the cost of these durable goods should not feature in the consumption aggregate of the time at which the good was purchased. An important example of this is owneroccupied housing. Further measurement difficulties arise in the assessment of the value of various nonmarket goods and services – such as those provided freely by the government – and the value of intangible benefits such as the quality of the environment, the benefit of peace and security, and so on.
Whether it is income or consumption expenditures that are measured and compared, an important issue is how to account for the variability of prices across space and time. Conceptually, this also encompasses variability in quality and in quantity constraints. Failure to account for such variability can distort comparisons of wellbeing across time and space. In Ecuador, for instance (Hentschel and Lanjouw 1996), and in many other countries, some households have free access to water, and tend to consume relatively large quantities of it with zero water expenditure. Others (often periurban dwellers) need to purchase water from private vendors and consequently consume a lower quantity of it at necessarily higher total expenditures. Ranking households according to water expenditures could wrongly suggest that those who need to buy water are better off and derive greater utility from water consumption (since they spend more on it).
Microeconomic theory suggests that we may wish to account for price variability by comparing real as opposed to nominal consumption expenditures (or income). Several procedures can be followed to enable such comparisons. A first procedure estimates the parameters of consumers' indirect utility functions. Let these parameters be denoted by ν and the indirect utility function be defined by V(y, q, ν), where q is the price vector and y is total nominal expenditure (we abstract from savings). Suppose that reference prices are given by qR. Equivalent consumption expenditure is then given implicitly by yR:
$$V(y^R, q^R, \nu) = V(y, q, \nu). \tag{2.1}$$
Inversion of the indirect utility function yields an equivalent expenditure function e, which indicates how much expenditure at reference prices is needed to be equivalent to (or to generate the same utility as) the expenditure observed at current prices q:
$$y^R = e\left(q^R, \nu, V(y, q, \nu)\right). \tag{2.2}$$
Distributive analysis would then proceed by comparing the real incomes defined in terms of the reference prices qR.
An alternative procedure deflates by a costofliving index the level of total nominal consumption expenditures. One way of defining such a costofliving index is to ask what expenditure is needed just to reach a poverty level of utility υz at prices q. This is given by e(q, ν, υz). A similar computation is carried out for the expenditure needed to attain υz at prices qR: this is e(qR, ν, υz). The ratio
$$\frac{e(q, \nu, \upsilon_z)}{e(q^R, \nu, \upsilon_z)} \tag{2.3}$$
is then a costofliving index. Dividing y by (2.3) yields real consumption expenditure.
In practice, costofliving indices are often taken to be those aggregate consumer price indices routinely computed by national statistical agencies. These consumer price indices usually vary across regions and time, but not across levels of income (e.g., across the poor and the nonpoor). In some circumstances (i.e., for homothetic utility functions and when consumer preferences are identical), all of the above procedures are equivalent. In general, however, they are not the same.
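The equivalence under homothetic and identical preferences can be illustrated with a worked case. The Cobb-Douglas form below is an assumption chosen for the illustration, with two goods and a single preference parameter $\nu$; it is a sketch and not a general proof.

```latex
% Assumed (illustrative) direct utility: U(x_1, x_2) = x_1^{\nu} x_2^{1-\nu}.
% Utility-maximizing demands: x_1 = \nu y / q_1, \quad x_2 = (1-\nu) y / q_2.
\begin{align*}
V(y, q, \nu) &= y \left(\frac{\nu}{q_1}\right)^{\nu}
                 \left(\frac{1-\nu}{q_2}\right)^{1-\nu}
  && \text{(indirect utility)} \\
e(q, \nu, u) &= u \left(\frac{q_1}{\nu}\right)^{\nu}
                 \left(\frac{q_2}{1-\nu}\right)^{1-\nu}
  && \text{(expenditure function)} \\
y^R = e\left(q^R, \nu, V(y, q, \nu)\right)
  &= y \left(\frac{q_1^R}{q_1}\right)^{\nu}
       \left(\frac{q_2^R}{q_2}\right)^{1-\nu}
  && \text{(equivalent expenditure)} \\
\frac{e(q, \nu, \upsilon_z)}{e(q^R, \nu, \upsilon_z)}
  &= \left(\frac{q_1}{q_1^R}\right)^{\nu}
     \left(\frac{q_2}{q_2^R}\right)^{1-\nu}
  && \text{(cost-of-living index)}
\end{align*}
```

In this case the cost-of-living index does not depend on the utility level $\upsilon_z$ at which it is evaluated, and dividing $y$ by the index gives exactly the equivalent expenditure $y^R$. With non-homothetic preferences, the index would in general vary with the reference utility level and the two procedures would differ.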
The fact that utility functions are not generally homothetic, and that preferences are highly heterogeneous, has important implications for distributive analysis and public policy. First, the true costofliving index would normally be different across the poor and the rich. Using the same price index for the two groups may distort comparisons of wellbeing. An example is the effect of an increase in the price of food on economic welfare. Since the share of food in total consumption is usually higher for the poor than for the rich, this increase should hurt the poor disproportionately more. Deflating nominal consumption by the same index for the entire population will, however, suggest that the burden of the food price increase is shared proportionately by all.
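The food price example can be made concrete with a fixed-share (Laspeyres-type) price index, used here as a simple approximation to a true cost-of-living index. All numbers are hypothetical: food is assumed to take 70% of the budget of the poor, 30% of the budget of the rich and 50% of the aggregate budget, and the price of food is assumed to rise by 20% while other prices are unchanged.

```python
def share_weighted_index(shares, price_ratios):
    """Laspeyres-type index: base-period budget shares times price ratios.

    `shares` must sum to one; `price_ratios` holds new price / old price.
    """
    return sum(shares[good] * price_ratios[good] for good in shares)


# Hypothetical price change: food up by 20%, nonfood unchanged.
price_ratios = {"food": 1.20, "nonfood": 1.00}

# Hypothetical budget shares.
poor_shares = {"food": 0.70, "nonfood": 0.30}
rich_shares = {"food": 0.30, "nonfood": 0.70}
common_shares = {"food": 0.50, "nonfood": 0.50}

poor_index = share_weighted_index(poor_shares, price_ratios)
rich_index = share_weighted_index(rich_shares, price_ratios)
common_index = share_weighted_index(common_shares, price_ratios)

# Real value of 100 units of nominal consumption for a poor household,
# deflated by the group-specific index and by the common index.
poor_real_own = 100 / poor_index
poor_real_common = 100 / common_index

print("index for the poor:", round(poor_index, 3))
print("index for the rich:", round(rich_index, 3))
print("common index:      ", round(common_index, 3))
print("poor real consumption (own index):   ", round(poor_real_own, 2))
print("poor real consumption (common index):", round(poor_real_common, 2))
```

Under these assumed shares, deflating by the common index overstates the real consumption of the poor after the price increase, since their own index rises by more than the common one.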
Spatial disaggregation is also important if consumption preferences and price changes vary systematically across regions. In few developing countries, however, are consumer price indices available or sufficiently disaggregated spatially. The alternative for the analyst is then to produce different poverty lines for different regions (based on the same or different consumption baskets, but using different prices) or construct separate price indices. In both cases, the analyst would usually be using regional price information derived from consumption survey data. The resulting indices would then be interpreted as costofliving indices, and could help correct for spatial price variation and regional heterogeneity in preferences.
To see why these adjustments are necessarily in part arbitrary, and to see why they can matter in practice, consider the case of Figure 2.1. It shows 3 indifference curves U1, U2 and U3, for three consumers, 1, 2 and 3. Two of these consumers have relatively strong preferences for meat as opposed to fish, and the third (represented by U3) has strong preferences for fish. Also shown are two budget constraints, one using relative prices qc (c for coastal area), where the price of fish is relatively low, and the other with qm (m for mountainous area), where the price of fish is high compared to the price of meat.
How is the standard of living for individuals 1, 2 or 3 to be compared? One way to answer this question is to "cost" the consumption of the three individuals. For this, we may use either qc or qm. If we use the mountains' relative price, then the consumption bundles chosen by individuals 1 and 3 are equivalent in terms of value: they lie on the same budget constraint of value B in terms of meat (the numéraire). Individual 2 is clearly then the worst off of all three. If instead we use the coastal area's relative price, then the consumption bundles chosen by individuals 2 and 3 are equivalent, with a common value of A in terms of meat – and individual 1 is the best off.
Hence, choosing reference prices to assess and compare living standards can matter significantly. If we knew a priori that individuals 1 and 3 had equivalent living standards, then reference prices qm would be the right ones (conversely: qc would be the correct reference prices if 2 and 3 could be assumed to be equally well off). But such information is generally not available. In some circumstances, such as in comparing 1 and 2, we can be fairly certain that one individual is better off than another, whatever the choice of reference prices, but even then, the extent of the quantitative difference in wellbeing can vary to a large extent with the choice of reference prices.
The choice of reference prices and reference preferences will also matter for estimating the impact of price changes on wellbeing and equity. Consider again Figure 2.1. Suppose that we wish to measure the impact on consumers' wellbeing of an increase in the price of fish. Assume for simplicity that this change in relative prices is captured by a move from qc to qm. If we were to choose as a reference bundle the bundle of meat and fish chosen by individuals 1 and 2 to capture the impact of this change, then the price impact would be estimated to be fairly low. The reason is that both individuals consume little fish. For instance, take meat as the numéraire and assess the real income value of being at U1. Under qm, this is given by B and under qc by D. Using instead the preferences of individual 3 as reference tastes (and thus U3 as reference wellbeing), real consumption would move from A to B, a much greater change.
Furthermore, even if 3 were deemed better off than 1 before the increase in the price of fish, it could well be that 3's strong preferences for fish would make him less well off than 1 after the price change. Hence, when consumer preferences are heterogeneous, price changes can reverse rankings of wellbeing. Indeed, in Figure 2.1, the increase in the price of fish is visibly much more costly for fish eaters than for meat eaters. This warns again against the use of a common price index across all regions as well as across all socioeconomic groups – rich and poor.
Suppose the following direct utility function over the two goods x1 and x2,
$$U(x_1, x_2) = x_1^{\nu} x_2^{1-\nu},$$
with ν = 1/3, and let prices q1 and q2 be set to 1.
1 What is the expenditure needed to attain a poverty level of utility of 158.74 at the reference prices q1R = 1 and q2R = 1? (Call this zR.)
answer: see Table 2.1, zR = 300 for U = 158.74
2 What are the quantities of goods 1 and 2 that are consumed at the poverty level of utility?
answer: see Table 2.1, x1 = 100 and x2 = 200 for U = 158.74
3 Suppose that the price of good 2 is increased from 1 to 3. What is the new cost of the level of poverty utility? (Call this z.)
answer: see Table 2.2, z = 624 for U = 158.74
4 Using definitions (2.1) and (2.2), prove the following:
yR/zR = y/z.
What does it imply?
answer: When preferences are homothetic, poverty measures are the same under the following two methods that one can use to adjust nominal income:
Equivalent expenditure method: y* = e(qR, xR,υ(q,x,y))
Welfare ratio method: y* = y/z
5 Suppose now that a poverty analyst does not believe that consumption of goods 1 and 2 will adjust following good 2's price increase. What is the poverty line z that he would then obtain? (Hint: compute the cost of the initial commodity basket using the new prices.)
answer: z = 700 (the initial basket x1 = 100, x2 = 200 valued at the new prices: 100 × 1 + 200 × 3)
6 Using indifference curves and budget constraints, show the difference that taking account of changes in behavior can make for the computation of price indices and the assessment of poverty.
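Table 2.1

| i | y | x1 | x2 | U | y/z |
|---|---|---|---|---|---|
| 1 | 150.00 | 50.00 | 100.00 | 79.37 | 0.50 |
| 2 | 210.00 | 70.00 | 140.00 | 111.12 | 0.70 |
| 3 | 300.00 | 100.00 | 200.00 | 158.74 | 1.00 |
| 4 | 380.00 | 126.67 | 253.33 | 201.07 | 1.27 |
| 5 | 500.00 | 166.67 | 333.33 | 264.57 | 1.67 |
| 6 | 510.00 | 170.00 | 340.00 | 269.86 | 1.70 |
| 7 | 550.00 | 183.33 | 366.67 | 291.02 | 1.83 |
| 8 | 600.00 | 200.00 | 400.00 | 317.48 | 2.00 |
| 9 | 800.00 | 266.67 | 533.33 | 423.31 | 2.67 |
| 10 | 1000.00 | 333.33 | 666.67 | 529.13 | 3.33 |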
Table 2.2

| i | y | x1 | x2 | U | yR | y/z | yR/zR | y/700 |
|---|---|---|---|---|---|---|---|---|
| 1 | 160.00 | 53.33 | 35.56 | 40.70 | 76.92 | 0.26 | 0.26 | 0.23 |
| 2 | 200.00 | 66.67 | 44.44 | 50.88 | 96.15 | 0.32 | 0.32 | 0.29 |
| 3 | 500.00 | 166.67 | 111.11 | 127.19 | 240.37 | 0.80 | 0.80 | 0.71 |
| 4 | 624.00 | 208.00 | 138.67 | 158.74 | 300.00 | 1.00 | 1.00 | 0.89 |
| 5 | 1100.00 | 366.67 | 244.44 | 279.82 | 528.82 | 1.76 | 1.76 | 1.57 |
| 6 | 1240.00 | 413.33 | 275.56 | 315.43 | 596.13 | 1.99 | 1.99 | 1.77 |
| 7 | 1300.00 | 433.33 | 288.89 | 330.70 | 624.97 | 2.08 | 2.08 | 1.86 |
| 8 | 1500.00 | 500.00 | 333.33 | 381.57 | 721.12 | 2.40 | 2.40 | 2.14 |
| 9 | 1600.00 | 533.33 | 355.56 | 407.01 | 769.20 | 2.56 | 2.56 | 2.29 |
| 10 | 2770.00 | 923.33 | 615.56 | 704.64 | 1331.68 | 4.44 | 4.44 | 3.96 |
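The answers to the exercise can be checked numerically. The sketch below assumes the Cobb-Douglas form U = x1^υ x2^(1−υ) with υ = 1/3, which is consistent with the figures in the two tables; the function names are ours and purely illustrative.

```python
# Cobb-Douglas sketch for the exercise (assumed form): U = x1**v * x2**(1 - v), v = 1/3.
V = 1.0 / 3.0

def demands(y, q1, q2, v=V):
    """Utility-maximising quantities: constant budget shares v and 1 - v."""
    return v * y / q1, (1.0 - v) * y / q2

def utility(x1, x2, v=V):
    return x1 ** v * x2 ** (1.0 - v)

def indirect_utility(y, q1, q2, v=V):
    x1, x2 = demands(y, q1, q2, v)
    return utility(x1, x2, v)

def expenditure(u, q1, q2, v=V):
    """Minimum cost of reaching utility u at prices (q1, q2)."""
    return u * (q1 / v) ** v * (q2 / (1.0 - v)) ** (1.0 - v)

u_pov = indirect_utility(300.0, 1.0, 1.0)       # poverty utility, about 158.74
z_ref = expenditure(u_pov, 1.0, 1.0)            # zR at the reference prices
z_new = expenditure(u_pov, 1.0, 3.0)            # z once the price of good 2 is 3
x1_ref, x2_ref = demands(z_ref, 1.0, 1.0)       # basket bought at the poverty line
z_fixed_basket = 1.0 * x1_ref + 3.0 * x2_ref    # same basket valued at new prices

y = 160.0                                       # individual 1 of Table 2.2
y_ref = expenditure(indirect_utility(y, 1.0, 3.0), 1.0, 1.0)   # equivalent expenditure yR
welfare_ratio = y / z_new
equivalent_ratio = y_ref / z_ref
```

Under these assumptions the two ratios coincide, as question 4 asks to prove, while the fixed-basket line exceeds the behaviour-adjusted line because it ignores substitution away from the good whose price has risen.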
A fundamental problem arises when comparing the wellbeing of individuals who live in households of differing sizes and composition. Differences in household size and composition can indeed be expected to create differences in household "needs". It is essential to take these differences in needs into account when comparing the wellbeing of individuals living in differing households. This is usually done using equivalence scales. With these scales, the needs of a household of a particular size and composition are compared to those of a reference household, usually one made of one reference adult.
Strategies for the estimation of equivalence scales are all contingent on the choice of comparable indicators of wellbeing. The choice of any such indicator is, however, intrinsically arbitrary. A popular example is food share in total consumption: at equal household food shares, individuals of various household types are assumed to be equally well-off. But, at equal wellbeing, one household type could certainly choose a food share that differs from that of other household types. This would be the case, for instance, for households of smaller sizes for which it could make perfect sense to spend more on food than on those goods for which economies of scale are arguably larger, such as housing. Failing to take this differential price effect into account would lead to an overestimation of the needs of small households.
Another difficulty arises when household size and composition are the result of a deliberate free choice. It may be argued, for instance, that a couple which elects freely to have a child cannot perceive this increase in household size to be utility decreasing. This would be so even if the household's total consumption remained unchanged after the birth of the child (or even if it fell), despite the fact that most poverty analysts would judge this birth to increase household "needs". Another difficulty lies in the fact that the intra-household decision-making process can distort the allocation of resources across household members, and thereby lead to wrong inferences of comparative needs. This is the case, for instance, when more is spent on boys than on girls, not because of greater boy needs, but because of differential gender preferences on the part of the household decision-maker. Such observations can lead analysts to overestimate the real needs of boys relative to those of girls. In turn, this would underestimate on average the level of deprivation experienced by girls and their households, since it would be wrongly assumed that girls are less "needy". An analogous analytical difficulty arises when the household decision-maker is a man, and the consumption of his spouse is observed to be smaller than his own. Is this due to gender-biased household decision-making, or to gender-differentiated needs?
To illustrate these issues, consider Figure 2.2, which graphs consumption of a reference good xr(y,q) against household income y. The predicted consumption of the reference good is plotted for two households, the first composed of only one man, and the other made of a couple (i.e., a man and a woman). A common procedure in the equivalence scales literature is the estimation of the total household income at which a reference consumption of a reference good is equal for all household types. The basic argument is that when the consumption of that reference good is the same across households, the wellbeing of household members should also be the same across households. Reference goods are often goods consumed exclusively by some members of the household, such as adult clothing.
For Figure 2.2, take for instance the case of men's clothing for xr(y,q). Suppose that the reference level of that good is given by x0. Leaving aside issues of consumption heterogeneity within households of the same type at a given income level, one would estimate that the one-member household would need an income yc in order to consume x0 (at point c), and that the two-member household would require total household income yd to reach that same reference consumption level. Hence, following this line of argument, the second household would need yd/yc times as much income as the first one to be "as well off" in terms of consumption of men's clothing. Said differently, the second household's needs would be yd/yc times those of the single-man household. The number of "equivalent adults" in the second household would then be said to be yd/yc. When applied to different household types, this procedure provides a full equivalence scale, expressing the needs of various household types as a function of those of a reference household.
Such a procedure faces many problems, however, most of which are very difficult to resolve. First, there is the choice of the reference level of xr(y,q). If a reference level of x1 instead of x0 were chosen in Figure 2.2, the number of adult equivalents in the second household would fall from yd/yc to yf/ye. There is little that can be done in general to determine which of these two scales is the right one. In such cases, one cannot use a welfare-independent equivalence scale: the equivalence scale ratios must depend on the levels of the households' reference wellbeing.
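The dependence of the scale on the reference level can be sketched with two invented Engel curves for the reference good. The linear forms and coefficients below are hypothetical and are not taken from Figure 2.2; they only reproduce its qualitative message.

```python
# Hypothetical inverse Engel curves for a reference good (e.g. men's clothing).
# Coefficients are invented for illustration only.

def income_needed_single(x):
    """Income at which the one-man household consumes x of the reference good."""
    return 10.0 * x

def income_needed_couple(x):
    """Income at which the couple consumes x of the reference good."""
    return 50.0 + 12.5 * x

def equivalent_adults(x_ref):
    """Ratio of the incomes at which both household types consume x_ref."""
    return income_needed_couple(x_ref) / income_needed_single(x_ref)

scale_at_x0 = equivalent_adults(10.0)   # plays the role of yd/yc
scale_at_x1 = equivalent_adults(20.0)   # plays the role of yf/ye
```

With these made-up curves the estimated number of equivalent adults falls from 1.75 to 1.5 as the reference level doubles, so the scale is not independent of the reference level of wellbeing.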
Equivalence scale estimates also generally depend on the choice of the reference good. For instance, the choice of adult clothing versus that of tobacco, alcohol or other adult commodities will generally matter in trying to compare the needs, say, of households with and without children. This is in part because preferences for these goods are not independent of (and do not depend in the same manner on) household composition. One additional problem is the issue of the price dependence of equivalence scale estimates. Choosing a different q in Figure 2.2, for instance, would usually lead to the estimation of different equivalence scale ratios.
In view of these difficulties, the literature has often emphasized that the choice of a particular scale inevitably introduces value judgements and some arbitrariness. It would therefore seem important to recognize explicitly such difficulties when measuring and comparing poverty and inequality levels.
Allowing the assessment of needs to vary turns out to be especially relevant in cross-country comparative analyses, particularly when those countries compared differ significantly in their socioeconomic composition. There is in this case the added issue that not only can the appropriate scale rates be uncertain in a given country, but they may also be different between countries. Testing the sensitivity of inequality and poverty results to changes in the incorporation of needs would seem particularly important for those comparisons whose results can influence redistributive policies, e.g., through the transfer of resources from some regions or household types to others.
To see how to carry out such sensitivity analysis, define an equivalence scale E as a function of household needs. This function will typically depend on the characteristics of the M different household members, such as their sex and age, and on household characteristics, such as location and size. Because E is normalized by the needs of a single adult, it can be interpreted as a number of "equivalent adults", viz, household needs as a proportion of the needs of a single adult. A "parametric" class of equivalence scales is often defined as a function of one or of a few relevant household characteristics, with parameters indicating how needs are modified as these characteristics change.
A survey of Buhmann, Rainwater, Schmaus, and Smeeding (1988) reported 34 different scales from 10 countries, which they summarized as

$$E = M^{s}, \qquad (2.4)$$

with s being a single parameter summarizing the sensitivity of E to household size M. This needs elasticity, s, can be expected to vary between 0 and 1. For s = 0, no account is taken of household size. For s = 1, adult-equivalent income is equal to per capita household income. The larger the value of s, the smaller are the economies of scale in the production of wellbeing that are implicitly assumed by the equivalence scale, and the greater is the impact of household size upon household needs.
An obvious limitation of a simple function such as (2.4) is its dependence solely on household size and not on household composition or other relevant sociodemographic characteristics. Most equivalence scales do indeed distinguish strongly between the presence of adults and that of children, and some like that of McClements (1977) even discriminate finely between children of different ages. An example of a class of equivalence scales that is more flexible than the above was suggested by Cutler and Katz (1992). This class takes separately into account the importance of the M_A adults and the M − M_A children:

$$E = \left(M_A + c\,(M - M_A)\right)^{s},$$

where c is a constant reflecting the resource cost of a child relative to that of an adult, and s is now an indicator of the degree of overall economies of scale within the household. When c = 1, children count as adults (which is the assumption made in (2.4)); otherwise, adults and children are assumed to have different needs.
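A sensitivity analysis of this kind can be sketched in a few lines. The household income and composition below are hypothetical, and the function names are ours; the two scale forms are those described in the text, namely E = M^s and E = (M_A + c(M − M_A))^s.

```python
def scale_size(m, s):
    """Size-only scale: E = M**s."""
    return m ** s

def scale_adults_children(m_adults, m_children, c, s):
    """Adults/children scale: E = (M_A + c * (M - M_A))**s."""
    return (m_adults + c * m_children) ** s

household_income = 1000.0    # hypothetical total household income
m = 4                        # hypothetical household size

# Adult-equivalent income for three values of the needs elasticity s.
equivalent_income = {s: household_income / scale_size(m, s) for s in (0.0, 0.5, 1.0)}

# Two adults and two children, each child costing half an adult, no economies of scale.
e_children = scale_adults_children(2, 2, c=0.5, s=1.0)
```

For this household, adult-equivalent income ranges from total household income (s = 0) down to per capita income (s = 1), which shows how much a poverty or inequality comparison can hinge on the choice of s.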
Finally, and as elsewhere in distributive analysis, there is the practically insoluble difficulty of having to make interpersonal comparisons of wellbeing across individuals, compounded by the fact that individuals here are heterogeneous in their household composition. On the basis of which observable variable can we really make interpersonal comparisons of wellbeing? Again, note that the assumption that wellbeing for the man is the same as wellbeing for the couple when xr(y,q) is equalized in Figure 2.2 is a very strong one. Furthermore, apart from influencing preferences and commodity consumption, household formation is, as indicated above, itself a matter of choice and is presumably a source of utility in its own right. Preferences for household composition are themselves heterogeneous, and so is the utility derived from a certain household status. All of this makes comparisons of wellbeing across heterogeneous individuals and the use of equivalence scales a source of arbitrariness and significant measurement errors.
An additional problem in measuring individual living standards using survey data comes from the presence of intra-household inequality. The final unit of observation in surveys is customarily the household. Little information is typically generated on the intra-household allocation of wellbeing (e.g., on the individual benefits stemming from total household consumption). Because of this, the usual procedure is to assume that adult-equivalent consumption (once computed) is enjoyed identically by all household members.
This, however, is at best an approximation of the true distribution of economic wellbeing in a household. If the nature of intra-household decision-making leads to important disparities in wellbeing across individuals, assuming equal sharing will underestimate inequality and aggregate poverty. Not being able to account for intra-household inequities will also have important implications for profiling the poor, and also for the design of public policy. For instance, a poverty assessment that correctly showed the deprivation effects of unequal sharing within households could indicate that it would be relatively inefficient to target support at the level of the entire household without taking into account how the targeted resources would subsequently be allocated within the household. Instead, it might be better to design public policy such as to self-select the least privileged individuals within the households, in the form of specific in-kind transfers or specially designed incentive schemes.
A final and related difficulty concerns who we are counting in aggregating poverty: is it individuals or households? Although this distinction is fundamental, it is often surprisingly hidden in applied poverty profiles and poverty measurement papers. The distinction matters since there is usually a strong positive correlation between household size and a household's poverty status. Said differently, poverty is usually found disproportionately among the larger households. Because of this, counting households instead of individuals will typically underestimate the true proportion of individuals in poverty.
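The gap between the two ways of counting can be seen with a small invented sample in which, as is usually found in practice, the larger households are the poor ones. The data below are hypothetical.

```python
# Hypothetical households: (size, is_poor).
households = [(1, False), (2, False), (5, True), (6, True)]

poor_households = sum(1 for _, poor in households if poor)
share_of_households = poor_households / len(households)

poor_individuals = sum(size for size, poor in households if poor)
total_individuals = sum(size for size, _ in households)
share_of_individuals = poor_individuals / total_individuals
```

Half of the households are poor here, but 11 of the 14 individuals live in a poor household, so counting households understates the proportion of individuals in poverty.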
The literature investigating the foundations and the impact of alternative approaches to measuring wellbeing is large and (yet) rapidly increasing.
Influential discussions of the conceptual foundations can be found in Dasgupta (1993), Sen (1981), Sen (1983), Sen (1985), Streeten, Burki, Ul Haq, Hicks, and Stewart (1981) and Townsend (1979).
Papers considering the impact of the accounting period (e.g., short-term vs long-term incomes) on the distribution of wellbeing include Aaberge, Bjorklund, Jantti, Palme, Pedersen, Smith, and Wennemo (2002), Arkes (1998), Bjorklund (1993), Burkhauser, Frick, and Schwarze (1997), Burkhauser and Poupore (1997), Coronado, Fullerton, and Glass (2000), Creedy (1997), Creedy (1999a), Gibson, Huang, and Rozelle (2001), Harding (1993) and Parker and Siddiq (1997).
The comprehensiveness of income concepts can also make a difference. A good introduction to the general methodological issues is Hentschel and Lanjouw (1996). The impact of the difference between cash and more comprehensive measures of income is studied inter alia in Formby, Kim, and Zheng (2001), Gustafsson and Makonnen (1993), Gustafsson and Shi (1997), Harding (1995), Jenkins and O'Leary (1996), Smeeding, Saunders, Coder, Jenkins, Fritzell, Hagenaars, Hauser, and Wolfson (1993), Smeeding and Weinberg (2001), Van den Bosch (1998), and Yates (1994). The role of public services is also discussed in Anand and Ravallion (1993); see also Propper (1990) and ?) for adjusting the value of public services for the costs of accessing them.
The sensitivity of the measurement of wellbeing to the choice between consumption and income measures is analyzed in Barrett, Crossley, and Worswick (2000b), Barrett, Crossley, and Worswick (2000a), Blacklow and Ray (2000), Blundell and Preston (1998), Cutler and Katz (1992), Jorgenson (1998), Mitrakos and Tsakloglou (1998), O'Neill and Sweetman (2001), Slesnick (1993) and Zaidi and de Vos (2001).
Choosing the units of analysis, be they individuals, households or equivalent adults, also influences distributive analysis, as studied by Bhorat (1999), Carlson and Danziger (1999), Ebert (1999), and Sutherland (1996). This is closely related to the growing concerns expressed about the role of income pooling/sharing within families and households; see for instance Cantillon and Nolan (2001), Haddad and Kanbur (1990), Jenkins (1991), Kanbur and Haddad (1994), Lazear and Michael (1988), Lundberg, Pollak, and Wales (1997), Phipps and Burton (1995), Quisumbing, Haddad, and Pena (2001), and Woolley and Marshall (1994). Ebert and Moyes (2003) explore the normative implications of a concern for equality in living standards for the use of equivalence scales in applied studies.
Price adjustments can also be important to making consistent comparisons of wellbeing across time, space and socioeconomic groups, and for measuring equity and poverty properly. A good introduction to the methodological literature is given by Donaldson (1992). Empirical evidence can be found in Araar (2002), Bodier and Cogneau (1998), Deaton (1988), Erbas and Sayers (1998), Finke, Chern, and Fox (1997), Idson and Miller (1999), Muller (2002), Pendakur (2002), Rao (2000), Ruiz Castillo (1998) and Slesnick (2002).
Justification and examples of the use by economists of non-money-metric measures of wellbeing can be found inter alia in De Gregorio and Lee (2002) (for a link between education and income inequality), Haveman and Bershadker (2001) (self-reliance), Jensen and Richter (2001) (children's health), Klasen (2000) and Layte, Maitre, Nolan, and Whelan (2001) (deprivation), Sahn and Stifel (2000) (a composite welfare index), Sefton (2002) (fuel poverty) and Skoufias (2001) (calorie intake). The wealth distribution is also often of interest: see for instance Wolff (1998) for a review of the American evidence. Alternative measures of wellbeing are also explored in Davies, Joshi, and Clarke (1997) (for a construction of a deprivation index), Desai and Shah (1988) (for estimates of relative deprivation), Hagenaars (1986) (for perceptions of poverty), and Narayan and Walton (2000) (for participatory evidence on the living conditions and views of more than 20,000 poor people).
Survey measurement problems are numerous. See for instance Fields (1994) for a general discussion, Juster and Kuester (1991) for wealth measurement, and Lanjouw and Lanjouw (2001) for the estimation of food and non-food consumption expenditures.
The sensitivity of distributive analysis to the "equivalization" of incomes has been the focus of much work in the last 15 years. This includes Banks and Johnson (1994), Bradbury (1997), Buhmann, Rainwater, Schmaus, and Smeeding (1988), Burkhauser, Smeeding, and Merz (1996), Coulter, Cowell, and Jenkins (1992b), Coulter, Cowell, and Jenkins (1992a), de Vos and Zaidi (1997), Duclos and Mercader Prats (1999), Jenkins and Cowell (1994), Lancaster and Ray (2002), Lanjouw and Ravallion (1995), Lyssiotou (1997), Meenakshi and Ray (2002), Phipps (1993), and Ruiz Castillo (1998).
The econometric and theoretical difficulties involved in the estimation of equivalence scales are formidable, and these are discussed inter alia in Blundell and Lewbel (1991), Blundell (1998), and Pollak (1991). Estimation of equivalence scales is performed in Bosch (1991), Nicol (1994), Pendakur (1999) (where they are found to be "base-independent"), Pendakur (2002) (where they are found to be price-dependent), Phipps and Garner (1994) (where they are found to be different across Canada and the United States), Phipps (1998), and Radner (1997) (where they depend on the types of income considered).
An attempt to identify and estimate unconditional preferences for goods and demographic characteristics is Ferreira, Buse, and Chavas (1998). Whether equivalence scales should be income-dependent, and what happens if they are, is studied among others in Aaberge and Melby (1998), Blackorby and Donaldson (1993), and Conniffe (1992).
The normative issues raised by the presence of heterogeneity in the population – heterogeneity other than in the dimension of income – are numerous, and some of them are examined in Ebert and Moyes (2003), Fleurbaey, Hagneré, and Trannoy (2003), Glewwe (1991), and Lewbel (1989).
In what follows in this book, we will denote living standards by the variable y. The indices we will use will sometimes require these living standards to be strictly positive, and, for expositional simplicity, we may assume that this is always the case. Strictly positive values of y are required, for instance, for the Watts poverty index and for many of the decomposable inequality indices. It is of course reasonable to expect indicators of living standards such as consumption or expenditures to be strictly positive. This assumption is less natural for other indicators, such as income, for which capital losses or retrospective tax payments can generate negative values. Also recall that, for expositional simplicity, we will also usually refer to living standards as incomes.
Let p = F(y) be the proportion of individuals in the population who enjoy a level of income that is less than or equal to y. F(y) is called the cumulative distribution function (cdf) of the distribution of income; it is non-decreasing in y, and varies between 0 and 1, with F(0) = 0 and F(∞) = 1¹. For expositional simplicity, we will sometimes implicitly assume that F(y) is continuously differentiable and strictly increasing in y. These are reasonable approximations for large-population distributions of income. They are also reasonable assumptions from the point of view of describing the data generating processes that generate the distributions of income observed in practice. The density function, which is the first-order derivative of the cdf, is denoted as f(y) = F'(y) and is strictly positive when F(y) is assumed to be strictly increasing in y.
¹ DAD: Distribution | Distribution Function.
A useful tool throughout the book will be "quantile functions". The use of quantiles will help simplify greatly the exposition and the computation of several distributive measures. Quantiles will also sometimes serve as direct tools to analyze and compare distributions of living standards (to check first-order dominance for instance). The quantile function Q(p) is defined implicitly as F(Q(p)) = p, or using the inverse distribution function, as $Q(p) = F^{-1}(p)$². Q(p) is thus the living standard level below which we find a proportion p of the population. Alternatively, it is the income of that individual whose rank (or percentile) in the distribution is p. A proportion p of the population is poorer than he is; a proportion 1 − p is richer than he is.
These tools are illustrated in Figure 3.1. The horizontal axis shows percentiles p of the population. The quantiles Q(p) that correspond to different p values are shown on the vertical axis. The larger the rank p, the higher the corresponding income Q(p). Alternatively, incomes y appear on the vertical axis of Figure 3.1, and the proportions of individuals whose income is below or equal to those y are shown on the horizontal axis. At the maximum income level, ymax, that proportion F(ymax) equals 1. The median is given by Q(0.5), which is the income value which splits the distribution exactly in two halves.
Note that an important expositional advantage of working with quantiles is to normalize the population size to 1. This also means that everyone's income and contribution to this book's poverty and equity analysis can then appear on an interval of percentiles ranging from 0 to 1. In a sense, the population size is thus scaled to that of a socially representative individual. Normalizing all population sizes to 1 also makes comparisons of poverty and equity accord with the population invariance principle. This principle says that adding an exact replicate of a population to that same population should not change the value of its distributive indices. Putting everyone's income within a common total population scale of 1 is a handy descriptive way of comparing populations of different sizes. It also ensures that adding exact replications to these populations will not change the distributive picture.
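The population invariance principle can be checked on a small example. The three incomes below are hypothetical, and the empirical cdf is computed as the share of incomes less than or equal to y.

```python
def cdf(incomes, y):
    """Empirical F(y): share of incomes less than or equal to y."""
    return sum(1 for v in incomes if v <= y) / len(incomes)

population = [10, 20, 30]          # hypothetical incomes
replicated = population * 2        # the same population with an exact replicate added

same_cdf = all(cdf(population, y) == cdf(replicated, y) for y in range(0, 41, 5))
same_mean = sum(population) / len(population) == sum(replicated) / len(replicated)
```

Because both the cdf and the mean are expressed per head, doubling the population leaves them unchanged, which is what normalizing population size to 1 is meant to guarantee.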
We will define most of the distributive measures (indices and curves) in terms of integrals over a range of percentiles. This is a familiar procedure in the context of continuous distributions. We will see below why this procedure is also generally valid in the context of discrete distributions, even though the use of summation signs is often more familiar in that context. Using integrals will make the definitions and the exposition simpler, and will help focus on what matters more, namely, the interpretation and the use of the various measures.
² DAD: Curves | Quantile.
The most common summary index of a distribution is its mean. Using integrals and quantiles, it is defined simply as:

$$\mu = \int_0^1 Q(p)\,dp. \qquad (3.1)$$

μ is therefore the area underneath the quantile curve. This corresponds to the grey area shown on Figure 3.1. Since the horizontal axis varies uniformly from 0 to 1, μ is also the average height of the quantile curve Q(p), and this is given by μ on the vertical axis. μ is thus the income of the population's "average individual".
The computation of the average income μ gives equal weight to all incomes in the population. We will see later in the book alternative weighting schemes for computing socially representative incomes. As for most distributions of income, the one shown on Figure 3.1 is skewed to the right, which gives rise to a mean μ that exceeds the median Q(0.5). Said differently, the proportion of individuals whose income falls underneath the mean, F(μ), exceeds one half.
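A small right-skewed sample makes the point concrete. The five incomes below are invented for illustration.

```python
incomes = sorted([10, 20, 30, 40, 150])     # hypothetical right-skewed incomes
n = len(incomes)

mean = sum(incomes) / n
median = incomes[n // 2]                    # Q(0.5) when n is odd
share_below_mean = sum(1 for y in incomes if y <= mean) / n   # F(mu)
```

Here the single high income pulls the mean (50) above the median (30), and four individuals out of five have an income below the mean.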
To see how to rewrite the above definitions using familiar summation signs for discrete distributions, we need a little more notation. Say that we are interested in a distribution of n incomes. We first order the n observations of yi in increasing values of y, such that y1 ≤ y2 ≤ y3 ≤ … ≤ yn−1 ≤ yn. We then associate to these n observations discrete quantiles over the interval of p between 0 and 1. For p such that (i − 1)/n < p ≤ i/n, we then have Q(p) = yi. Technically, this is equivalent to defining quantiles as Q(p) = min{y | F(y) ≥ p}. This is illustrated in Table 3.1 for n = 3 and where the three income values are 10, 20 and 30. Figure 3.2 graphs those quantiles as a function of p. Values of p between 0 and 1/3 give a quantile of 10, the second income, 20, covers percentiles 1/3 to 2/3, and the highest income, 30, covers percentiles 2/3 to 1.
The formulae for discrete distributions are then computed in practice by replacing the integral sign in the continuous case by a summation sign, by summing across all quantiles, and by dividing that sum by the number of observations n. Thus, the mean μ of a discrete distribution can be expressed as:

$$\mu = \frac{1}{n}\sum_{i=1}^{n} Q(i/n) = \frac{1}{n}\sum_{i=1}^{n} y_i. \qquad (3.2)$$

Thus, whenever an expression like (3.1) arises, we can think of the integral sign as standing for a summation sign and of dp as standing for 1/n.
Using (3.2), the mean of the discrete distribution of Table 3.1, which is 20, is then simply the integral of the quantile curve shown on Figure 3.2. In other words, it is the sum of the area of the three boxes each of length 1/3 that can be found underneath the filled curve. For completeness, we will mention from time to time how indices and curves can be estimated using the more familiar summation signs. For more information, the reader can also consult DAD's User Guide, where the estimation formulae shown use summation signs and thus apply to discrete distributions.
Table 3.1

| i | i/n | Q(i/n) = yi |
|---|---|---|
| 1 | 0.33 | 10 |
| 2 | 0.66 | 20 |
| 3 | 1 | 30 |
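The discrete quantile rule and the summation form of the mean can be sketched as follows for the three incomes of Table 3.1. The helper name `quantile` and the small numerical tolerance are our own choices.

```python
import math

incomes = [10, 20, 30]            # the distribution of Table 3.1

def quantile(values, p):
    """Q(p) = y_i for (i - 1)/n < p <= i/n, i.e. the smallest y with F(y) >= p."""
    ordered = sorted(values)
    n = len(ordered)
    # The tolerance guards against floating-point noise when p * n is an integer.
    rank = max(1, math.ceil(p * n - 1e-12))
    return ordered[rank - 1]

n = len(incomes)
# Discrete counterpart of the integral of Q(p): sum the n quantiles and divide by n.
mean_from_quantiles = sum(quantile(incomes, i / n) for i in range(1, n + 1)) / n
```

Any p between 0 and 1/3 returns 10, any p above 2/3 returns 30, and the mean obtained by summing the three quantiles and dividing by 3 is 20.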
For poverty comparisons, we will also need the concept of quantiles censored at a poverty line z. These are denoted by Q*(p; z) and defined as:

$$Q^*(p; z) = \min\left(Q(p),\, z\right).$$

Censored quantiles are therefore just the incomes Q(p) for those in poverty (below z) and z for those whose income exceeds the poverty line. This is illustrated on Figure 3.3, which is similar to Figure 3.1. Quantiles Q(p) and censored quantiles Q*(p; z) are identical up to p = F(z), or up to Q(p) = z. After this point, censored quantiles equal a constant z and therefore diverge from the quantiles Q(p).
The mean of the censored quantiles is denoted as μ*(z):

$$\mu^*(z) = \int_0^1 Q^*(p; z)\,dp.$$

This is the area underneath the curve of censored incomes Q*(p; z). Censoring income at z helps focus attention on poverty, since the precise value of those living standards that exceed z is irrelevant for poverty analysis and poverty comparisons (at least so long as we consider absolute poverty).
The poverty gap at percentile p, g(p; z), is the difference between the poverty line and the censored quantile at p, or, equivalently, the shortfall (when applicable) of living standard Q(p) from the poverty line. Let $f_+ = \max(f, 0)$.
E:18.7.6
Poverty gaps can then be defined as³:

$$g(p; z) = z - Q^*(p; z) = \max\left(z - Q(p),\, 0\right) = \left(z - Q(p)\right)_+.$$

When income at p exceeds the poverty line, the poverty gap equals zero. A shortfall g(q; z) at rank q is shown on Figure 3.3 by the distance between z and Q(q). The larger one's rank p in the distribution (the higher up in the distribution of income), the lower the poverty gap g(p; z). The proportion of individuals with a positive poverty gap is given by F(z). The average poverty gap then equals μ^g(z):

$$\mu^g(z) = \int_0^1 g(p; z)\,dp.$$

μ^g(z) is then the size of the area in grey shown on Figure 3.3.
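For a discrete distribution these quantities reduce to simple averages. The sketch below reuses the three incomes of Table 3.1 together with a hypothetical poverty line of 25.

```python
incomes = [10, 20, 30]     # the three incomes of Table 3.1
z = 25                     # hypothetical poverty line

n = len(incomes)
censored = [min(y, z) for y in incomes]         # censored incomes Q*(p; z)
gaps = [max(z - y, 0) for y in incomes]         # poverty gaps g(p; z)

mu_censored = sum(censored) / n                 # mean of censored incomes
mu_gap = sum(gaps) / n                          # average poverty gap
share_with_gap = sum(1 for g in gaps if g > 0) / n   # proportion with a positive gap
```

Two of the three individuals have a positive gap (15 and 5), and the average poverty gap equals the poverty line minus the mean censored income, as the definitions imply.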
There are two types of poverty and equity comparisons: cardinal and ordinal ones. Cardinal comparisons involve comparing numerical estimates of poverty and equity indices. Ordinal comparisons broadly rank poverty and equity across distributions, without attempting to quantify the precise differences in poverty and equity that exist between these distributions. They can often say where poverty and equity are larger or smaller, but not by how much.
Consider for instance the case of cardinal poverty comparisons. Numerical poverty estimates attach a single number to the extent of aggregate poverty in a population, e.g., 40% or $200 per capita. But calculating cardinal poverty estimates requires making a number of very specific assumptions. These include, inter alia, assumptions on the form of the poverty index, the definition of the indicator of wellbeing, the choice of equivalence scales, the value of the poverty line, and how that poverty line varies precisely across space and time.
Once these assumptions are made, cardinal poverty estimates can tell, for instance, that the consumption expenditures of 30% of the individuals in a population lie underneath a poverty line, but that a proposed government program could decrease that proportion to 25%. Cardinal poverty estimates can also be used to carry out a money-metric cost-benefit analysis of the effects of social programs. Thus, if the above government program involved yearly expenditures of $500 million, then we would know immediately that a one-percentage-point fall in the proportion of the poor would cost on average $100 million to the government. That amount could then be compared to the poverty alleviation cost of other forms of government policy.
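The arithmetic behind this cost figure is spelled out below, using the numbers of the example above; the variable names are ours.

```python
headcount_before = 30.0      # percent of individuals below the poverty line
headcount_after = 25.0       # percent after the proposed program
program_cost = 500e6         # yearly program expenditures, in dollars

points_reduced = headcount_before - headcount_after     # percentage points
cost_per_point = program_cost / points_reduced          # dollars per percentage point
```

The program lowers the headcount by 5 percentage points, so each point costs $100 million on average; the same calculation can be repeated for competing policies.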
The main advantage of cardinal estimates of poverty and equity is their ease of communication, their ease of manipulation, and their (apparent) lack of ambiguity. Government officials and the media often want the results of distributive comparisons to be produced in straightforward and seemingly precise terms, and will often feel annoyed when this is not possible. As hinted above, cardinal estimates of poverty and equity are, however, necessarily (and often highly) sensitive to the choice of a number of arbitrary measurement assumptions.
³ DAD: Curves | Poverty gap.
It is clear, for example, that choosing a different poverty line will almost always change the estimated numerical value of any index of poverty. The elasticity of the poverty headcount index with respect to the poverty line is, for example, often significantly larger than 1 (see Section 12.2). This implies that a 10% variation in the poverty line will then change the estimated proportion of the poor in the population by more than 10%; this sensitivity is substantial, especially since poverty lines are rarely convincingly bounded within a narrow interval.
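To give a sense of magnitudes, the elasticity of the headcount with respect to the poverty line, z f(z)/F(z), can be evaluated for an assumed income distribution. The lognormal form, its parameters and the poverty line below are all hypothetical choices made for illustration, not estimates from any survey.

```python
import math

def lognormal_cdf(z, m, sigma):
    """F(z) for a lognormal distribution with log-mean m and log-sd sigma."""
    u = (math.log(z) - m) / sigma
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def lognormal_pdf(z, m, sigma):
    """f(z) = F'(z) for the same distribution."""
    u = (math.log(z) - m) / sigma
    return math.exp(-0.5 * u * u) / (z * sigma * math.sqrt(2.0 * math.pi))

def headcount_elasticity(z, m, sigma):
    """d ln F(z) / d ln z = z * f(z) / F(z)."""
    return z * lognormal_pdf(z, m, sigma) / lognormal_cdf(z, m, sigma)

m, sigma = 0.0, 0.5      # hypothetical parameters (median income equal to 1)
z = 0.6                  # hypothetical poverty line, 60% of the median

elasticity = headcount_elasticity(z, m, sigma)
# Proportional change in the headcount when the line is raised by 10%.
headcount_change = lognormal_cdf(1.1 * z, m, sigma) / lognormal_cdf(z, m, sigma) - 1.0
```

With these assumed values the elasticity is roughly 3, and raising the poverty line by 10% raises the headcount by well over 10%.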
Another source of cardinal variability comes from the choice of the form of a distributive index. Many procedures have been proposed for instance to aggregate individual poverty. Depending on the chosen procedure, numerical estimates of aggregate poverty will end up larger or lower. As we will see later, for instance, identifying a "socially representative poverty gap" will hinge particularly on the relative weight given to the more deprived among the poor. There is little objective guidance in choosing that weight; the greater its value, however, the greater the socially representative poverty gap, and the greater the estimate of aggregate poverty.
Ordinal comparisons, on the other hand, do not attach a precise numerical value to the extent of poverty or equity, but only try to rank poverty and equity across all indices that obey some generally-defined normative (or ethical) principles. This can be useful when it suffices to know which of two policies will better alleviate poverty, or which of two distributions has more inequality, but not precisely by how much. Because of this lower information requirement, ordinal rankings can prove robust to the choice of a number of measurement assumptions. For instance, ordinal poverty orderings can often rank poverty over general classes of possible poverty indices and wide ranges of possible poverty lines.
It is thus useful to consider in turn cardinal and ordinal comparisons of poverty and equity. We first see how to construct aggregate cardinal distributive indices. Ordinal comparisons are considered in Part III.
The Lorenz curve has been for several decades the most popular graphical tool for visualizing and comparing income inequality. As we will see, it provides complete information on the whole distribution of incomes relative to the mean. It therefore gives a more comprehensive description of relative incomes than any one of the traditional summary statistics of dispersion can give, and it is also a better starting point when looking at income inequality than the computation of the many inequality indices that have been proposed. As we will see, its popularity also comes from its usefulness in establishing orderings of distributions in terms of inequality, orderings that can then be said to be "ethically robust".
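The Lorenz curve is defined as follows¹:

$$L(p) = \frac{\int_0^p Q(q)\,dq}{\int_0^1 Q(q)\,dq} = \frac{1}{\mu}\int_0^p Q(q)\,dq. \qquad (4.1)$$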
The numerator sums the incomes of the bottom p proportion (the poorest 100p%) of the population. The denominator sums the incomes of all. L(p) thus indicates the cumulative percentage of total income held by a cumulative proportion p of the population, when individuals are ordered in increasing income values. For instance, if L(0.5) = 0.3, then we know that the 50% poorest individuals hold 30% of the total income in the population.
1DAD: CurvesLorenz.
A discrete formulation of the Lorenz curve is easily provided. Recall that the discrete income values yi are ordered such that y1 ≤ y2 ≤... ≤ yn, with percentiles pi = i/n such that Q(pi) = yi. For i = 1,...n, the discrete Lorenz curve is then defined as:

$$L(p_i = i/n) = \frac{1}{n\mu}\sum_{j=1}^{i} Q(p_j). \qquad (4.2)$$
If needed, other values of L(p) in (4.2) can be obtained by interpolation.
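The discrete Lorenz curve and its interpolation can be sketched as follows. This is only an illustration of the cumulative-share definition given above, with linear interpolation assumed between the discrete points; the function names are hypothetical.

```python
def lorenz_curve(incomes):
    """Return the points (p_i, L(p_i)), i = 0..n, of the discrete Lorenz curve."""
    ordered = sorted(incomes)          # incomes in increasing order
    n = len(ordered)
    total = sum(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, y in enumerate(ordered, start=1):
        running += y                   # cumulated income of the i poorest
        points.append((i / n, running / total))
    return points


def lorenz_at(points, p):
    """Linearly interpolate L(p) between two adjacent discrete points."""
    for (p0, l0), (p1, l1) in zip(points, points[1:]):
        if p0 <= p <= p1:
            return l0 + (l1 - l0) * (p - p0) / (p1 - p0)
    raise ValueError("p must lie between 0 and 1")


# Two hypothetical incomes for which the poorest 50% hold 30% of total income.
pts = lorenz_curve([7, 3])
print(pts)
print(lorenz_at(pts, 0.25))
```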
The Lorenz curve has several interesting properties. As shown in Figure 4.1, it ranges from L(0) = 0 to L(1) = 1, since a proportion p = 0 of the population necessarily holds a proportion of 0% of total income, and since a proportion p = 1 of the population must hold 100% of aggregate income. L(p) is increasing as p increases, since more and more incomes are then added up. This is also seen by the fact that the derivative of L(p) equals Q(p)/μ:

$$\frac{dL(p)}{dp} = \frac{Q(p)}{\mu}. \qquad (4.3)$$
This is positive if incomes are positive, as we are assuming throughout. Hence by observing the slope of the Lorenz curve at a particular value of p, we also know the p-quantile relative to the mean, or, in other words, the income of an individual at rank p as a proportion of mean income. An example of this can be seen on Figure 4.1 for p = 0.5. The slope of L(p) at that point is Q(0.5)/μ, the ratio of the median to the mean. The slope of L(p) thus portrays the whole distribution of mean-normalized incomes.
The Lorenz curve is also convex in p, since as p increases, the new incomes that are being added up are greater than those that have already been counted. This is clear from equation (4.3) since Q(p) is increasing in p. Mathematically, a curve is convex when its second derivative is positive, and the more positive that second derivative, the more convex is the curve. Formally, the second-order derivative of the Lorenz curve equals

$$\frac{d^2 L(p)}{dp^2} = \frac{1}{\mu}\frac{dQ(p)}{dp}.$$
Note that, by definition, p ≡ F(Q(p)). Differentiating this identity with respect to p, we have that 1 ≡ f(Q(p)) dQ(p)/dp. Thus,

$$\frac{dQ(p)}{dp} = \frac{1}{f(Q(p))},$$
and we therefore have that

$$\frac{d^2 L(p)}{dp^2} = \frac{1}{\mu\, f(Q(p))}.$$
The larger the density of income f(Q(p)) at a quantile Q(p), the less convex the Lorenz curve at L(p). The convexity of the Lorenz curve is thus revealing of the density of incomes at various percentiles. On Figure 4.1, this density is visibly larger for lower values of p, since this is where the slope of L(p) changes less rapidly as p increases.
Some measures of central tendency can also be identified by a look at the Lorenz curve. In particular, the median (as a proportion of the mean) is given by Q(0.5)/μ, and thus, as mentioned above, by the slope of the Lorenz curve at p = 0.5. Since many distributions of incomes are skewed to the right, the mean often exceeds the median and Q(p = 0.5)/μ will typically be less than one. The mean income in the population is found at that percentile at which the slope of L(p) equals 1, that is, where Q(p) = μ and thus at percentile F(μ) (as shown on Figure 4.1). Again, this percentile will often be larger than 0.5, the median income's percentile. The percentile of the mode (or modes) is where L(p) is least convex, since by equation (4.4) this is where the density f(Q(p)) is highest.
Simple summary measures of inequality can readily be obtained from the graph of a Lorenz curve. The share in total income of the bottom p proportion of the population is given by L(p); the greater that share, the more equal is the distribution of income. Analogously, the share in total income of the richest p proportion of the population is given by 1 – L(p); the greater that share, the more unequal is the distribution of income. These two simple indices of inequality are often used in the literature.
An interesting but less well-known index of inequality is given by the proportion of total income that would need to be reallocated across the population to achieve perfect equality in income. This proportion is given by the maximum value of p – L(p), which is attained where the slope of L(p) is 1 (i.e., at L(p = F(μ))). It is therefore equal to F(μ) – L(F(μ)). This index is usually called the Schutz coefficient.
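As a rough numerical check of this interpretation, the sketch below computes the Schutz coefficient in two ways for a discrete sample: as the largest gap p – L(p) over the discrete percentiles, and as the share of total income held in excess of the mean (the amount that would have to be reallocated). The function names and the sample are hypothetical.

```python
def schutz_coefficient(incomes):
    """Largest gap between population share and income share, max of p - L(p)."""
    ordered = sorted(incomes)
    n = len(ordered)
    total = sum(ordered)
    running = 0.0
    best = 0.0
    for i, y in enumerate(ordered, start=1):
        running += y
        best = max(best, i / n - running / total)
    return best


def redistribution_share(incomes):
    """Share of total income lying above the mean, to be moved to reach equality."""
    mu = sum(incomes) / len(incomes)
    return sum(max(y - mu, 0.0) for y in incomes) / sum(incomes)


sample = [2, 5, 9, 14, 30]  # hypothetical incomes
print(schutz_coefficient(sample), redistribution_share(sample))
```

With a piecewise-linear Lorenz curve the maximum gap is always reached at one of the discrete points, which is why scanning those points suffices here.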
Mean-preserving equalizing transfers of income are often called Pigou-Dalton transfers. In money-metric terms, they involve a marginal transfer of $1, say, from a richer person (of percentile r, say) to a poorer person (of percentile q < r) that keeps total income constant. All indices of inequality which do not increase (and sometimes fall) following any such equalizing transfers are said to obey the Pigou-Dalton principle of transfers. These equalizing transfers also have the consequence of moving the Lorenz curve unambiguously closer to the line of perfect equality. This is because such transfers do not affect the value of L(p) for all p up to q and for all p greater than r, but they increase L(p) for all p between q and r.
Hence, let the Lorenz curve LB(p) of a distribution B be everywhere above the Lorenz curve LA(p) of a distribution A. We can think of B as having been obtained through a series of equalizing Pigou-Dalton transfers applied to an initial distribution A. Hence, inequality indices which obey the principle of transfers will unambiguously indicate more inequality in A than in B. We will come back to this important result in Chapter 11 when we discuss how to make ethically robust comparisons of inequality.
If all had the same income, the cumulative percentage of total income held by any bottom proportion p of the population would also be p. The Lorenz curve would then be L(p) = p: population shares and shares of total income would be identical. A useful informational content of a Lorenz curve is thus its distance, p – L(p), from the line of perfect equality in income. Compared to perfect equality, inequality removes a proportion p – L(p) of total income from the bottom 100p% of the population. The larger that "deficit", the larger the inequality of income.
If we were then to aggregate that deficit between population shares and income shares across all values of p between 0 and 1, we would get half the well-known Gini index²:

$$\int_0^1 \left(p - L(p)\right)dp = \frac{\text{Gini index}}{2}.$$
The Gini index implicitly assumes that all "share deficits" across p are equally important. It thus computes the average distance between cumulated population shares and cumulated income shares.
One can, however, also think of other weights to aggregate the distance p – L(p). The class of linear inequality indices is given by applying percentile-dependent weights to those distances. Let those weights be defined by κ(p). A popular one-parameter functional specification for such weights is given by

$$\kappa(p;\rho) = \rho(\rho-1)(1-p)^{\rho-2}, \qquad (4.8)$$
and depends on the value of a single "ethical" parameter ρ. That parameter must be greater than 1 for the weights κ(p;ρ) to be positive everywhere. The shape of κ(p;ρ) is shown on Figure 4.2 for values of ρ equal to 1.5, 2 and 3. The larger the value of ρ, the larger the value of κ(p;ρ) for small p.
2 DAD: Inequality  Gini/SGini Index.
Using (4.8) then gives what is called the class of S-Gini (or "Single-Parameter" Gini) inequality indices, I(ρ)³:

$$I(\rho) = \int_0^1 \left(p - L(p)\right)\kappa(p;\rho)\,dp. \qquad (4.9)$$
E:18.8.2
Note⁴ that I(2) is the standard Gini index. This is because κ(p;ρ = 2) ≡ 2, which then gives equal weight to all distances p – L(p). When 1 < ρ < 2, relatively more weight is given to the distances occurring at larger values of p, as shown by Figure 4.2. Conversely, when ρ > 2, relatively more weight is given to the distances found at lower values of p. Changing ρ thus changes the "ethical" concern which is felt for the "share deficits" at various cumulative proportions of the population.
Let ω(p;ρ) be defined as

$$\omega(p;\rho) = \int_p^1 \kappa(q;\rho)\,dq = \rho(1-p)^{\rho-1}. \qquad (4.10)$$
The shape of ω(p;ρ) is shown on Figure 4.3 for ρ equal to 1.5, 2 and 3. Note that ω(p;ρ) > 0 and that dω(p;ρ)/dp < 0 when ρ > 1. Since the integral of ω(p;ρ) over p between 0 and 1 equals 1 for any value of ρ, the area under each of the three curves on Figure 4.3 equals 1 too. Using (4.10) and integrating equation (4.9) by parts, we can then show that⁵:

$$I(\rho) = \int_0^1 \frac{\mu - Q(p)}{\mu}\,\omega(p;\rho)\,dp. \qquad (4.11)$$
E:18.8.31
This says that I(ρ) weights deviations of incomes from the mean by weights which fall with the ranks of individuals in the population. Since, in equation (4.11), I(ρ) is a (piecewise) linear function of the incomes Q(p), it is a member of the class of linear inequality measures, a feature which will prove useful later in measuring progressivity and vertical equity. The usual Gini index is then given simply by:

$$I(\rho=2) = \frac{2}{\mu}\int_0^1 \left(\mu - Q(p)\right)(1-p)\,dp. \qquad (4.12)$$
Yaari (1988) defines "an indicator for the policy maker's degree of equality mindedness at p" as –ω^(1)(p;ρ)/ω(p;ρ), where ω^(1)(p;ρ) is the first-order derivative of ω(p;ρ) with respect to p. This indicator thus captures the speed at which the weights ω(p;ρ) decrease with the ranks p. It gives:

$$-\frac{\omega^{(1)}(p;\rho)}{\omega(p;\rho)} = \frac{\rho-1}{1-p}. \qquad (4.13)$$

3DAD: InequalityGini/SGini Index.
4DAD: CurvesLorenz.
5DAD: InequalityGini/SGini Index.
Thus, the local degree of "equality mindedness" for ω(p;ρ) is a proportional function of the single parameter ρ. As definition (4.13) makes clear, this degree of inequality aversion is defined at a particular rank p in the distribution of income, independently of the precise value that income takes at that rank. The larger the value of ρ, the larger the local degree of equality mindedness, and the faster the fall of the weights ω(p;ρ) with an increase in the rank p. Therefore, the greater the value of ρ, the more sensitive is the social decisionmaker to differences in ranks when it comes to granting ethical weights to individuals.
The functions κ(p;ρ) and ω(p;ρ) can also be given an interpretation in terms of densities of the poor. Assume that r individuals are randomly selected from the population. The probability that the income of all of these r individuals will exceed Q(p) is given by [1 – F(Q(p))]^r. The probability of finding an income below Q(p) in such samples is then 1 – [1 – F(Q(p))]^r = 1 – [1 – p]^r. Thus, 1 – [1 – p]^r is the distribution function of the rank of the lowest income in samples of r individuals. The density of the lowest income rank in a sample of r randomly selected incomes is the derivative of that distribution with respect to p, which is

$$r(1-p)^{r-1}. \qquad (4.14)$$
This helps interpret the weights κ(p;ρ) and ω(p;ρ). By equation (4.8), κ(p;ρ) is ρ times the density of the lowest income in a sample of ρ – 1 randomly selected individuals; analogously, by equation (4.10), ω(p;ρ) is the density of the lowest income in a sample of ρ randomly selected individuals.
We might be interested in determining the impact of some inequality-changing process on the inequality indices of type (4.11). One such process that can be handled nicely spreads income away from the mean by a proportional factor λ, and thus corresponds to some form of bipolarization of incomes away from the mean (loosely speaking). This bipolarization process is equivalent to adding (λ – 1)(Q(p) – μ) to Q(p), since

$$Q(p) + (\lambda-1)\left(Q(p)-\mu\right) = \mu + \lambda\left(Q(p)-\mu\right) \qquad (4.15)$$
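does indeed spread income away from the mean by a proportional factor λ. As can be checked from equation (4.11), this changes I(ρ) proportionally by λ:

$$\int_0^1 \frac{\mu - \left[\mu + \lambda\left(Q(p)-\mu\right)\right]}{\mu}\,\omega(p;\rho)\,dp = \lambda\, I(\rho). \qquad (4.16)$$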
Equation (4.16) also says that the elasticity of I(ρ) with respect to λ, when λ equals 1 initially, is equal to 1 whatever the value of the parameter ρ.
Such bipolarization away from the mean is also equivalent to a process that increases the distance p – L(p) by a factor λ. That this gives the same change in I(ρ) can be checked from equation (4.9). This bipolarization process thus increases the deficit p – L(p) between population shares p and income shares L(p) by a constant factor λ across all p. We will see later in Chapter 12 how this distance-increasing process leads to a nice illustration of the possible impact of changes in inequality on poverty.
As shown on Figure 4.3 and in equation (4.11), the larger the value of ρ, the greater the weight given to the deviation of low incomes from the mean. When ρ becomes very large, the index I(ρ) equals the proportional deviation from the mean of the lowest income. When ρ = 1, the same weight ω(p;ρ = 1) ≡ 1 is given to all deviations from the mean, which then makes the inequality index I(ρ = 1) always equal to 0, regardless of the income distribution under consideration. Thus, S-Gini indices range between 0 (when all incomes are equal to the mean or when the ethical parameter ρ is set to 1) and 1 (when total income is concentrated in the hands of only one individual, or when ρ is large and the lowest income is close to 0). Since the Lorenz curve moves towards p when a Pigou-Dalton equalizing transfer is implemented, the value of the S-Gini indices also naturally decreases with such transfers.
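These properties can be checked numerically with a short sketch. It assumes that the quantile function is a step function equal to the i-th ordered income on each interval of width 1/n, so that integrating ρ(1 – p)^(ρ–1) over the i-th interval gives the weight [(n – i + 1)^ρ – (n – i)^ρ]/n^ρ. The function names are hypothetical, and the code is an illustration rather than DAD's own routine.

```python
def s_gini(incomes, rho):
    """S-Gini index: rank-weighted average of proportional deviations from the mean."""
    ordered = sorted(incomes)
    n = len(ordered)
    mu = sum(ordered) / n
    index = 0.0
    for i, y in enumerate(ordered, start=1):
        # Integral of rho * (1 - p) ** (rho - 1) over the i-th interval of width 1/n.
        weight = ((n - i + 1) ** rho - (n - i) ** rho) / n ** rho
        index += weight * (mu - y) / mu
    return index


def spread_from_mean(incomes, lam):
    """Move every income away from the mean by the proportional factor lam."""
    mu = sum(incomes) / len(incomes)
    return [mu + lam * (y - mu) for y in incomes]


sample = [1, 2, 3]  # hypothetical incomes
print(s_gini(sample, 1), s_gini(sample, 2), s_gini(sample, 200))
print(s_gini(spread_from_mean(sample, 1.5), 2))
```

The first line of output illustrates the limiting cases discussed above (ρ = 1, the standard Gini, and a very large ρ); the second illustrates that spreading incomes away from the mean by a factor λ scales the index by λ.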
Hence, ρ is a parameter of "inequality aversion" that captures our concern for the deviation of quantiles from the mean at various ranks in the population. In this sense, it is analogous to the parameter ε of relative inequality aversion which we will discuss below in the context of the Atkinson indices. For the standard Gini index of inequality, we have that ρ = 2 and thus that ω(p;ρ = 2) = 2(1 – p); hence in assessing the standard Gini, the weight on the deviation of one's income from the mean decreases linearly with one's rank in the distribution of income. In a discrete formulation, the weights ω(p;ρ) take the form of:

$$\omega(p_i;\rho) = \frac{(n-i+1)^{\rho} - (n-i)^{\rho}}{n^{\rho-1}}. \qquad (4.17)$$
The S-Gini indices can also be shown to be equal to the covariance formula

$$I(\rho) = \frac{-\mathrm{cov}\left(Q(p),\,\rho(1-p)^{\rho-1}\right)}{\mu}, \qquad (4.18)$$
a formula which can simplify their computation with common spreadsheet or statistical software. The traditional Gini is then simply:

$$I(\rho=2) = \frac{2\,\mathrm{cov}\left(Q(p),\,p\right)}{\mu}, \qquad (4.19)$$
and is just a proportion of the covariance between incomes and their ranks. Note here the interesting analogy of (4.19) with the variance, given by

$$\mathrm{var}\left(Q(p)\right) = \mathrm{cov}\left(Q(p),\,Q(p)\right). \qquad (4.20)$$
A further useful interpretive property of the standard Gini index is that it equals half the mean-normalized average distance between all incomes:

$$I(\rho=2) = \frac{1}{2\mu}\int_0^1\!\!\int_0^1 \left|Q(p) - Q(q)\right| dp\,dq. \qquad (4.21)$$
Thus, if we find that the Gini index of an income distribution equals 0.4, then we know that the average distance between the incomes of that distribution is of the order of 80% of the mean. Again, note the interesting link of (4.21) with another definition of the variance, which is $\mathrm{var}(Q(p)) = \frac{1}{2}\int_0^1\int_0^1 (Q(p)-Q(q))^2\,dp\,dq$, half the average squared distance between all incomes.
The Gini index can also be computed as the integral of a simple transformation of the familiar cumulative distribution function. Recall that F(y) and 1 – F(y) are simply the proportions of individuals with incomes below and above y. If we integrate the product of these proportions across all possible values of y, and normalize by mean income, we again obtain the Gini coefficient:

$$I(\rho=2) = \frac{1}{\mu}\int F(y)\left(1 - F(y)\right)dy. \qquad (4.22)$$
Note also that F(y)(1 – F(y)) is largest at F(y) = 0.5, which also explains why the Gini index is often said to be most sensitive to changes in incomes occurring around the median income.
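The covariance, average-distance and distribution-function expressions of the Gini index discussed above can be compared on a small sample. The sketch below is only illustrative: it assumes ranks p_i = i/n, population (not sample) covariances and the empirical distribution function; the function names and the data are hypothetical.

```python
def gini_covariance(incomes):
    """Twice the covariance between incomes and their ranks, divided by the mean."""
    ordered = sorted(incomes)
    n = len(ordered)
    mu = sum(ordered) / n
    ranks = [i / n for i in range(1, n + 1)]
    mean_rank = sum(ranks) / n
    cov = sum((y - mu) * (p - mean_rank) for y, p in zip(ordered, ranks)) / n
    return 2 * cov / mu


def gini_mean_distance(incomes):
    """Half the mean-normalized average absolute distance between all incomes."""
    n = len(incomes)
    mu = sum(incomes) / n
    total = sum(abs(a - b) for a in incomes for b in incomes)
    return total / (2 * mu * n * n)


def gini_cdf(incomes):
    """Integral of F(y) * (1 - F(y)) over y, divided by the mean."""
    ordered = sorted(incomes)
    n = len(ordered)
    mu = sum(ordered) / n
    area = 0.0
    for i in range(1, n):
        f = i / n  # empirical F(y) between the i-th and (i+1)-th ordered incomes
        area += f * (1 - f) * (ordered[i] - ordered[i - 1])
    return area / mu


sample = [2, 5, 9, 14, 30]  # hypothetical incomes
print(gini_covariance(sample), gini_mean_distance(sample), gini_cdf(sample))
```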
Now suppose that society can be split into two classes, and that income is equally distributed within each class.
1 Assume that those in the first class hold no income. The Gini index of the total population is then given by the population share of that zeroincome class.
2 Assume that the population share of each group is 0.5. The Gini index of the total population is then given by 0.5 – L(0.5). In other words, the income share of the bottom class is 0.5 minus the Gini coefficient.
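3 Assume that the population share of each group is again 0.5. Denote the incomes of those in the richer class by yR and of those in the poorer class by yP. We then have:

$$I(\rho=2) = 0.5 - \frac{y_P}{y_P + y_R} = \frac{y_R - y_P}{2\,(y_R + y_P)}, \qquad (4.23)$$

or alternatively

$$\frac{y_R}{y_P} = \frac{1 + 2\,I(\rho=2)}{1 - 2\,I(\rho=2)}, \qquad (4.24)$$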
which gives a simple relationship between incomes and the Gini coefficient. For instance, if yR = λyP, then the Gini index is simply (λ – 1)/(2λ + 2); for λ = 2, we thus have I(ρ = 2) = 1/6.
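The three two-class cases can be verified on small hypothetical populations. The sketch below uses the average-distance form of the Gini index; the function names and the income vectors are chosen only for illustration.

```python
def gini(incomes):
    """Standard Gini index: half the mean-normalized average distance between incomes."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum(abs(a - b) for a in incomes for b in incomes) / (2 * mu * n * n)


def two_class_gini(lam):
    """Gini index when two equally sized classes have incomes y and lam * y."""
    return (lam - 1) / (2 * lam + 2)


# Case 1: three quarters of the population hold no income.
print(gini([0, 0, 0, 4]))
# Case 2: equal population shares, bottom class holds 25% of total income.
print(gini([1, 1, 3, 3]))
# Case 3: the richer class earns twice as much as the poorer class.
print(gini([1, 1, 2, 2]), two_class_gini(2))
```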
A final interesting interpretation of the Gini index is in terms of relative deprivation, which has been linked in the sociological and psychological literature to subjective wellbeing, social protest and political unrest. Runciman (1966) defines it as follows:
The magnitude of a relative deprivation is the extent of the difference between the desired situation and that of the person desiring it (as he sees it). (p. 10)
Sen (1973), Yitzhaki (1979) and Hey and Lambert (1980) follow Runciman's lead to propose for each individual an indicator of relative deprivation that measures the distance between his income and the income of all those relative to whom he feels deprived. For instance, let the relative deprivation of an individual with income Q(p), when comparing himself to another individual with income Q(q), be given by:

$$\delta(p,q) = \begin{cases} 0 & \text{if } Q(p) \ge Q(q),\\ Q(q) - Q(p) & \text{if } Q(p) < Q(q). \end{cases} \qquad (4.25)$$
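The expected relative deprivation of an individual at rank p is then⁶:

$$\bar\delta(p) = \int_0^1 \max\{Q(q) - Q(p),\,0\}\,dq = \int_p^1 \left(Q(q) - Q(p)\right)dq. \qquad (4.26)$$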
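As we did for the "share deficits" above, we can aggregate the relative deprivation at every percentile p by applying the weights κ(p;ρ). Writing $\bar\delta(p)$ for the expected relative deprivation at rank p, we can show that this gives the S-Gini indices of inequality:

$$I(\rho) = \frac{1}{\rho\,\mu}\int_0^1 \bar\delta(p)\,\kappa(p;\rho)\,dp. \qquad (4.27)$$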
6DAD: CurvesRelative Deprivation.
Hence, the S-Gini indices are also a weighted average of the average relative deprivation felt in a population. By equations (4.8), (4.14) and (4.27), they equal the expected relative deprivation of the poorest individual in a sample of ρ – 1 randomly selected individuals. The greater the value of ρ, the more important is the relative deprivation of the poorer in computing I(ρ).
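For the standard Gini (ρ = 2), the sample of ρ – 1 individuals reduces to a single randomly selected individual, so the index should equal the population average of expected relative deprivation. The sketch below checks this on a discrete sample, assuming that deprivation is normalized by mean income; the function names are hypothetical.

```python
def expected_deprivation(incomes, own_income):
    """Average shortfall of own_income relative to everyone who is richer."""
    return sum(max(y - own_income, 0) for y in incomes) / len(incomes)


def gini_from_deprivation(incomes):
    """Average expected relative deprivation, as a proportion of mean income."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum(expected_deprivation(incomes, y) for y in incomes) / (n * mu)


sample = [1, 2, 3]  # hypothetical incomes
print([expected_deprivation(sample, y) for y in sample])
print(gini_from_deprivation(sample))
```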
We now introduce the concept of a social welfare function. Unlike relative inequality, which considers incomes relative to the mean, social welfare aggregates absolute incomes. We will see that under some popular conditions on the shape of social welfare functions, the measurement of inequality and social welfare can often be nicely linked and integrated, and that the tools used for the two concepts are then similar. This will explain why some inequality indices are sometimes called "normative".
The social welfare functions we consider take the form of:

$$W = \int_0^1 U(Q(p))\,\omega(p)\,dp, \qquad (4.28)$$
where for expositional simplicity we restrict ω(p) to be of the special form ω(p;ρ) defined by equation (4.10). U(Q(p)) is a "utility function" of income Q(p). Since ω(p;ρ) is the density of the lowest income rank in a sample of ρ individuals, social welfare is then the expected utility of the poorest individual in a sample of ρ randomly selected individuals.
Another requirement that we wish to impose on the form of W is that it be homothetic. Homotheticity of W is analogous to the requirement for consumer utility functions that the expenditure shares of the different consumption goods be constant as income increases, or the requirement for production functions that the ratios of the marginal products of inputs stay constant as output is increased. For social welfare measurement, homotheticity implies that the ratio of the marginal social utilities (the marginal utility being given by U'(Q(p)) ω(p)) of any two individuals in a population stays the same when all incomes are changed by the same proportion. For (4.28) to be homothetic, we need U(Q(p)) to take the popular form of U(Q(p); ε), which is defined as

$$U(Q(p);\varepsilon) = \begin{cases} \dfrac{Q(p)^{1-\varepsilon}}{1-\varepsilon} & \text{when } \varepsilon \ne 1,\\[2mm] \ln Q(p) & \text{when } \varepsilon = 1. \end{cases} \qquad (4.29)$$
Hence, W in equation (4.28) will depend on the parameters ρ and ε, and we will denote this as W(ρ,ε)⁷:

$$W(\rho,\varepsilon) = \int_0^1 U(Q(p);\varepsilon)\,\omega(p;\rho)\,dp. \qquad (4.30)$$
Homotheticity of a social welfare function has an important advantage: the social welfare function can then easily be used to measure relative inequality. To see how this can be done, define ξ(ρ, ε) as the equally distributed income that is equivalent, in terms of social welfare, to the actual distribution of income. We will refer to ξ as the EDE income, the equally distributed equivalent income. ξ(ρ,ε) is implicitly defined as:

$$\int_0^1 U(\xi(\rho,\varepsilon);\varepsilon)\,\omega(p;\rho)\,dp \equiv W(\rho,\varepsilon). \qquad (4.31)$$
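Since the weights ω(p;ρ) integrate to 1 over p, ξ(ρ,ε) is also such that

$$U(\xi(\rho,\varepsilon);\varepsilon) = W(\rho,\varepsilon), \qquad (4.32)$$

or, alternatively,

$$\xi(\rho,\varepsilon) = U^{-1}\left(W(\rho,\varepsilon);\varepsilon\right), \qquad (4.33)$$

where $U^{-1}(\cdot\,;\varepsilon)$ is the inverse utility function:

$$U^{-1}(x;\varepsilon) = \begin{cases} \left((1-\varepsilon)\,x\right)^{\frac{1}{1-\varepsilon}} & \text{when } \varepsilon \ne 1,\\[2mm] \exp(x) & \text{when } \varepsilon = 1. \end{cases} \qquad (4.34)$$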
The index of inequality I corresponding to the social welfare function W is then defined as the distance between the EDE and the mean incomes, expressed as a proportion of mean income:

$$I = \frac{\mu - \xi}{\mu} = 1 - \frac{\xi}{\mu}. \qquad (4.35)$$
Using ξ(ρ,ε) in (4.35) gives I(ρ,ε) = 1 – ξ(ρ,ε)/μ⁸.
Clearly, then, the EDE income is a simple function of average income and inequality, with

$$\xi = \mu\,(1 - I). \qquad (4.36)$$
7DAD: WelfareSGini Index.
8DAD: InequalityAtkinsonGini Index.
Compared to W, ξ has the advantage of being money-metric and thus of being easily interpreted. It can, for instance, be compared to other economic indicators that are also expressed in money-metric terms.
To increase social welfare, we can either increase μ or we can increase equality of income 1 – I by decreasing inequality I. Two distributions of income can display the same social welfare even with different average incomes if these differences are offset by differences in inequality. This is shown in Figure 4.4, starting initially with two different levels of mean income μ0 and μ1 and common zero inequality. We then have that ξ = μ0 and ξ = μ1. To preserve the same level of social welfare in the presence of inequality, mean income must be higher: this is shown by the positive slope of the constant ξ functions. Furthermore, as inequality becomes larger, further increases in I must be matched by higher and higher increases in mean income for social welfare not to fall.
Defined as in (4.35), inequality has an interesting interpretation: it measures the difference between
the mean level of actual income
and the (lower) level that would instead be needed to achieve the same level of social welfare were income distributed equally across the population.
This difference being expressed as a proportion of mean income, I thus shows the per capita proportion of income that is "wasted" in social welfare terms because of its unequal distribution. Society as a whole would be just as well-off with an equal distribution of a proportion of just 1 – I of total actual income. I can thus be interpreted as a unit-free indicator of the social cost of inequality.
Let a distribution B of income be a proportional rescaling of a distribution A. In other words, for a constant λ > 0, let QB(p) = λQA(p) for all p. If the social welfare function used for the computation of I is homothetic, it must be that IA = IB. This is illustrated in Figure 4.5 for the case of two incomes $y_1^A$ and $y_2^A$ for an initial distribution A, and two incomes $y_1^B$ and $y_2^B$ for a "scaled-up" distribution B (since λ > 1). Social welfare in A is given by WA. The social indifference curve WA shown in Figure 4.5 also depicts the many other combinations of incomes that would yield the same level of social welfare. The combination at point F corresponds to a situation of equality of income where both individuals enjoy ξA. ξA is therefore the equally distributed income that is socially equivalent to the distribution ($y_1^A$, $y_2^A$).
The average income in A is given by μA, which leads to point G = (μA, μA) in Figure 4.5. Hence two distributions of income, one made of the vector ($y_1^A$, $y_2^A$) and the other of the vector (ξA, ξA), generate the same level WA of social welfare, the first with an unequally distributed average income μA and the other with an equally distributed average income ξA. Hence, the vertical (or horizontal) distance between point F and point G in Figure 4.5 can be understood as the "cost of inequality" in A's distribution of income. Taking that distance as a proportion of μA (see equation (4.35)) gives the index of inequality IA.
That $y_1^B = \lambda y_1^A$ and $y_2^B = \lambda y_2^A$ for the same λ can be seen from the fact that the two vectors of income lie along the same ray from the origin. If the function W is homothetic, then inequality in A must be the same as inequality in B. In other words, the distance between points D and E as a proportion of the distance OE must be the same as the distance between points F and G as a proportion of the distance OG.
Two special cases of W(ρ,ε) are of particular interest in assessing social welfare and relative inequality. The first is when income ranks are not important per se in computing social welfare: this is obtained with ρ = 1, and it yields the well-known Atkinson additive social welfare function, W(ε)⁹:

$$W(\varepsilon) = \int_0^1 U(Q(p);\varepsilon)\,dp. \qquad (4.37)$$
This Atkinson social welfare function has had two major interpretations: 1) first, as a utilitarian social welfare function, where U(Q(p);ε) is an individual utility function displaying decreasing marginal utilities of income, and 2) second, as a concave social evaluation of a concave individual utility of income.
It can be argued, however, that "it is fairly restrictive to think of social welfare as a sum of individual welfare components", and that one might feel that "the social value of the welfare of individuals should depend crucially on the levels of welfare (or incomes) of others" (Sen 1973, pp. 30 and 41). The unrestricted form W(ρ,ε) allows for such interdependence and may therefore be thought more flexible than the Atkinson additive formulation. In the light of the above, we can indeed interpret W(ρ,ε) as the expected utility of the poorest individual in a group of ρ randomly selected individuals, or the expected social valuation of the utility of such individuals. This interpretation of the social evaluation function W(ρ,ε) confirms why it is not additive or separable in individual welfare: the social welfare weight on U(Q(p);ε) depends on the rank p of the individual in the whole distribution of income. It is only when ρ = 1, so that ω(p;ρ) ≡ 1, that W(ρ,ε) gives the simple average of the utilities U(Q(p);ε), with no weighting by ranks.
9DAD: WelfareAtkinson Index.
Figure 4.6 shows the shape of the utility functions U(y;ε) for different values of ε¹⁰. Incomes are shown on the horizontal axis as a proportion of their mean, and utility U(y;ε) can be read on the vertical axis. The normalization U(μ;ε) = 1 has been applied for graphical convenience. Although for all values of ε the slope of U(y;ε) is positive, that slope is not constant. This is made more explicit on Figure 4.7, which shows the marginal social utility of income U^(1)(y;ε) for different values of ε. Again, a normalization of U^(1)(μ;ε) = 1 is applied. For ε = 0, the marginal social utility is constant: increasing by a given amount a poor person's income has the same social welfare impact as increasing by the same amount a richer person's income. For ε > 0, however, increasing the poor's income is socially more desirable than increasing the rich's. The larger the value of ε, the faster marginal social utility falls with y.
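By (4.33) and (4.35), the Atkinson inequality index is then given by¹¹:

$$I(\varepsilon) = \begin{cases} 1 - \dfrac{\left(\int_0^1 Q(p)^{1-\varepsilon}\,dp\right)^{\frac{1}{1-\varepsilon}}}{\mu} & \text{when } \varepsilon \ne 1,\\[3mm] 1 - \dfrac{\exp\left(\int_0^1 \ln Q(p)\,dp\right)}{\mu} & \text{when } \varepsilon = 1. \end{cases} \qquad (4.38)$$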
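The Atkinson indices are said to exhibit constant relative inequality aversion since the elasticity of U^(1)(Q(p);ε) with respect to Q(p) is constant and equal, in absolute value, to ε:

$$\frac{dU^{(1)}(Q(p);\varepsilon)}{dQ(p)}\,\frac{Q(p)}{U^{(1)}(Q(p);\varepsilon)} = -\varepsilon. \qquad (4.39)$$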
The parameter ε is thus usually called the Atkinson parameter of relative inequality aversion.
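A minimal sketch of the Atkinson EDE income and inequality index for a discrete sample follows. It assumes strictly positive incomes and the isoelastic utility form (power function for ε ≠ 1, logarithm for ε = 1); the function names and the two-person example are hypothetical.

```python
import math


def ede_income(incomes, epsilon):
    """Equally distributed equivalent income under Atkinson social welfare."""
    n = len(incomes)
    if epsilon == 1:
        # Logarithmic utility: the EDE income is the geometric mean.
        return math.exp(sum(math.log(y) for y in incomes) / n)
    power = 1 - epsilon
    # Power utility: invert the average of y ** (1 - epsilon).
    return (sum(y ** power for y in incomes) / n) ** (1 / power)


def atkinson_index(incomes, epsilon):
    """Proportion of mean income 'wasted' because of inequality."""
    mu = sum(incomes) / len(incomes)
    return 1 - ede_income(incomes, epsilon) / mu


sample = [1, 4]  # hypothetical incomes, strictly positive
for eps in (0, 0.5, 1, 2):
    print(eps, ede_income(sample, eps), atkinson_index(sample, eps))
```

The index is zero for ε = 0 and rises with ε, in line with the interpretation of ε as a parameter of relative inequality aversion.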
Figure 4.8 illustrates graphically the link between the Atkinson social evaluation functions W(ε) and their associated inequality indices. For this, suppose a population of only two individuals, with incomes y1 and y2 as shown on the horizontal axis. Mean income is given by μ = (y1 + y2)/2 (the middle point between y1 and y2). The utility function U(y;ε) has a positive but decreasing slope. W(ε) is then given by (U(y1) + U(y2))/2, the average height of U(y1) and U(y2).
If equally distributed, a mean income of ξ would be sufficient to generate that same level of social welfare, since on Figure 4.8 we have that W(ε) = U(ξ;ε). The cost of inequality is thus given by the distance between μ and ξ, shown as C on Figure 4.8. Inequality is the ratio C/μ.
Graphically, the more "concave" the function U(y;ε), the greater the cost of inequality and the greater the inequality indices I(ε). This can be seen on Figure 4.9 where two functions U(y;ε) have been drawn, with different relative inequality aversion parameters ε0 < ε1. We have that W(ε0) = U(ξ0;ε0) and W(ε1) = U(ξ1;ε1). The difference in relative inequality aversion parameters leads to ξ0 > ξ1, and therefore to I(ε0) < I(ε1). A specification with greater inequality aversion leads to a greater inequality index, and to the judgement that inequality costs socially a greater proportion of average income.
10This paragraph draws from Cowell (1995), pp.4041.
11DAD: InequalityAtkinson Index.
The second special case of W(ρ,ε) is obtained when the utility functions U(Q(p);ε) are linear in the levels of income, and thus when ε = 0. This yields the class of S-Gini social welfare functions, W(ρ)¹²:

$$W(\rho) = \int_0^1 Q(p)\,\omega(p;\rho)\,dp. \qquad (4.40)$$
Social welfare is then the expected income of the poorest individual in a group of ρ randomly selected individuals. By (4.33), this is also the EDE income. Hence, the associated inequality indices are given by:

$$I(\rho) = 1 - \frac{W(\rho)}{\mu} = \int_0^1 \frac{\mu - Q(p)}{\mu}\,\omega(p;\rho)\,dp,$$
which is seen by (4.11) to be the same as the S-Gini inequality indices I(ρ). Hence, social welfare and the EDE income equal per capita income corrected by the extent of relative deprivation in those incomes:

$$W(\rho) = \xi(\rho) = \mu\left(1 - I(\rho)\right).$$
A useful curve for the analysis of the distribution of absolute incomes is the Generalized Lorenz curve. It is defined as GL(p)¹³:

$$GL(p) = \int_0^p Q(q)\,dq = \mu\,L(p),$$

and is illustrated on Figure 4.10. The Generalized Lorenz curve has all of the attributes of the Lorenz curve, except for the fact that it does not normalize incomes by their mean. GL(p) gives the absolute contribution to per capita income of the bottom p proportion (the 100p% poorest) of the population. GL(p) is thus also the per capita income that would be available if society could rely only on the income of the bottom p proportion of the population. Assume for instance that μ = $20,000 and that GL(0.5) = $5,000. Then, per capita income would be only $5,000 if we assumed that the richest 50% of the population were suddenly to retire and earn no income. Note also that GL(p)/p gives the average income of the bottom p proportion of the population. In the example just provided, the average income of the 50% poorest would be $10,000, half the level of overall average income.
12DAD: WelfareSGini Index.
13DAD: CurvesGeneralized Lorenz.
Combining (4.9), (4.35) and (4.40) further shows that the Generalized Lorenz curve has a nice graphical link to the S-Gini indices of social welfare:

$$W(\rho) = \int_0^1 GL(p)\,\kappa(p;\rho)\,dp.$$
A popular descriptive index of inequality is the quantile ratio. This is simply the ratio of two quantiles, Q(p2)/Q(p1), using percentiles p1 and p2¹⁴. Popular values of p1 and p2 include p1 = 0.25 and p2 = 0.75 (the quartile ratio), as well as p1 = 0.10 and p2 = 0.90 (the decile ratio). Note that these values of p1 and p2 are often reversed. Median income is also a popular choice for Q(p1). Observe also that these ratios are by definition insensitive to changes that affect quantiles other than Q(p1) and Q(p2). Moreover, none of them is consistent with Lorenz inequality orderings: it can be that the Lorenz curve for a distribution A is always above that of distribution B, but that quantile ratios suggest that B has less inequality than A. For inequality analysis, an arguably better choice for normalizing Q(p2) is mean income: an index such as Q(p2)/μ can indeed be shown to be consistent with first-order (restricted) inequality dominance (we discuss this in Chapter 11).
The coefficient of variation is the ratio of the standard deviation to the mean of income. It is given by [15]:
$$CV = \frac{\sqrt{\int_0^1 \left( Q(p) - \mu \right)^2 dp}}{\mu}$$
and is therefore a function of the squared distance between incomes and the mean.
[14] DAD: Inequality|Quantiles Ratio.
[15] DAD: Inequality|Coefficient of Variation.
Two other popular measures of inequality use distances in logarithms of income. The first one, which we can call the logarithmic variance, is defined as [16]
and the second, the variance of logarithms, as [17]
These two last measures do not, however, always obey the Pigou-Dalton principle of transfers: they will sometimes increase following a spread-reducing transfer of income between two individuals.
Finally, the relative mean deviation is the average absolute deviation from mean income, normalized by mean income [18]:
$$RMD = \frac{\int_0^1 \left| Q(p) - \mu \right| dp}{\mu}$$
Note that this measure is insensitive to transfers made between individuals whose income lies on the same side of the mean.
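A short numerical check can make this insensitivity concrete. The sketch below uses discrete-sample versions of the coefficient of variation and of the relative mean deviation, following their verbal definitions; the income vectors and function names are hypothetical.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def coefficient_of_variation(incomes):
    mu = mean(incomes)
    return math.sqrt(mean([(y - mu) ** 2 for y in incomes])) / mu

def relative_mean_deviation(incomes):
    mu = mean(incomes)
    return mean([abs(y - mu) for y in incomes]) / mu

before = [10, 20, 30, 60]   # mean income is 30
after = [15, 15, 30, 60]    # 5 transferred from the person at 20 to the person at 10

rmd_before, rmd_after = relative_mean_deviation(before), relative_mean_deviation(after)
cv_before, cv_after = coefficient_of_variation(before), coefficient_of_variation(after)
print(rmd_before, rmd_after, cv_before, cv_after)
```

Both individuals involved in the transfer lie below the mean, so the relative mean deviation does not move, whereas the coefficient of variation falls.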
A frequent goal is to explain the total amount of inequality in a distribution by the extent of inequality found among socioeconomic groups ("intra" or "within" group inequality) and across them ("inter" or "between" group inequality). There are several ways to do this. One method uses the class of inequality indices that are exactly decomposable into terms that account for within-group and between-group inequality. Although that class can be given a justification in terms of social welfare functions, this exercise is less transparent and intuitive than for the classes of relative inequality indices considered hitherto. Another method applies the Shapley decomposition to any type of inequality index. We discuss these two methods in turn.
For most practical purposes, we can express these decomposable inequality indices as Generalized entropy indices. We denote them as I(θ) [19]:
[16] DAD: Inequality|Logarithmic Variance.
[17] DAD: Inequality|Variance of Logarithms.
[18] DAD: Inequality|Relative Mean Deviation.
[19] DAD: Inequality|Entropy Index.
Some special cases of (4.50) are worth noting. First, if we constrain θ to be no greater than 1 and let θ = 1 − ε, I(θ) becomes ordinally equivalent to the family of Atkinson indices. This simply means that if an Atkinson index I(ε) indicates that there is more inequality in a distribution A than in a distribution B, then the index I(θ) with θ = 1 − ε will also necessarily indicate more inequality in A than in B. Second, the special case I(θ = 0) gives the Mean Logarithmic Deviation, since I(θ = 0) can also be expressed as
that is, as the average deviation between the logarithm of the mean and the logarithms of incomes. I(θ = 1) gives the well-known Theil index of inequality. I(θ = 2) is half the square of the coefficient of variation (see (4.46)) since I(θ = 2) can be rewritten as
Now assume that we can split the population into K mutually exclusive population subgroups, k = 1,...,K. The indices in (4.50) can then be decomposed as follows [20]:
where φ(k) is the proportion of the total population that belongs to subgroup k and μ(k) is the mean income of subgroup k.
I(k;θ) is inequality within subgroup k, defined in exactly the same way as in (4.50) for the total population. The first term in (4.53) can thus be interpreted as a weighted sum of the within-group inequalities in the distribution of income.
Ī(θ) is total population inequality when each individual in subgroup k is given the mean income μ(k) of his subgroup (namely, when within-subgroup inequality has been eliminated). Ī(θ) can thus be interpreted as the contribution of between-group inequality to total inequality.
[20] DAD: Decomposition|Entropy: Decomposition by Groups.
Note, however, that only when θ = 0 is it the case that the within-group inequality contributions do not depend on mean income in the groups; the terms I(k;θ = 0) are then strictly population-weighted. Otherwise, the within-group inequalities are weighted by weights which depend on the mean income in the subgroups k. Depending on the context, this can make I(θ = 0) a more attractive decomposable index than for other values of θ.
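The exactness of this decomposition can be verified numerically. The sketch below assumes the standard discrete form of the Generalized entropy indices (the mean of (y/μ)^θ, less one, divided by θ(θ − 1), with the usual logarithmic limits at θ = 0 and θ = 1) and within-group weights φ(k)(μ(k)/μ)^θ; the group data and function names are hypothetical.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def ge_index(incomes, theta):
    """Generalized entropy index I(theta) for a discrete sample (assumed standard form)."""
    mu = mean(incomes)
    if theta == 0:   # Mean Logarithmic Deviation
        return mean([math.log(mu / y) for y in incomes])
    if theta == 1:   # Theil index
        return mean([(y / mu) * math.log(y / mu) for y in incomes])
    return (mean([(y / mu) ** theta for y in incomes]) - 1) / (theta * (theta - 1))

def decompose(groups, theta):
    pooled = [y for g in groups for y in g]
    n, mu = len(pooled), mean(pooled)
    # Within term: group indices weighted by population share and relative mean income.
    within = sum(len(g) / n * (mean(g) / mu) ** theta * ge_index(g, theta) for g in groups)
    # Between term: inequality once everyone is given the mean income of his group.
    between = ge_index([mean(g) for g in groups for _ in g], theta)
    return within, between, ge_index(pooled, theta)

groups = [[10, 20, 30], [40, 60, 100, 120]]   # hypothetical subgroups
results = {theta: decompose(groups, theta) for theta in (0, 1, 2)}
for theta, (within, between, total) in results.items():
    print(theta, within, between, total)
```

For θ = 0 the weights (μ(k)/μ)^θ all equal 1, so the within term is a pure population-weighted average of the group indices.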
This decomposition involves two steps. The first one is to decompose total inequality into global between-group and within-group contributions. The second step is to express the global within-group contribution as a sum of the within-group contributions of each of the groups.
For each of these two steps, we want to assess by how much inequality would be reduced if we removed one of the "factors" that contribute to inequality. Take for instance the first step. It has two factors, within-group and between-group inequality. By how much would inequality fall if between-group inequality were eliminated? One estimate would be given by the difference between initial inequality and inequality after the mean income of all groups has been equalized. Another estimate would be given by the inequality that remains once within-group inequality is removed and all that is left is between-group inequality. These two estimates, however, will generally differ. Which one is better? Since there is no right answer to this question, an alternative is to use the average of the two estimates. Note that the first estimate gives the effect of the first factor when the second factor has not been removed, while the second estimate gives the effect of the first factor after the second factor has been eliminated.
Using the average marginal effect of removing a factor across all factor elimination sequences is what is implied by the choice of the Shapley value as a decomposition procedure. The procedure is detailed in an appendix found below in Section 4.7.
As mentioned above, applying the Shapley decomposition procedure to our subgroup inequality decomposition problem involves two steps. In the first step, we suppose that the two Shapley factors are between-group and within-group inequality. The basic rules followed to compute the marginal contribution of each of these factors are:
1. First, to eliminate within-group inequality and to calculate between-group inequality, we use a vector of incomes in which each observation is assigned the average income μ(k) of the observation's group k.
2. Second, to eliminate between-group inequality and to calculate within-group inequality, we use a vector of incomes where each observation has its income multiplied by the ratio μ/μ(k) of its group k, so that every group ends up with the same mean income μ.
To be more precise, let an inequality index I depend on the incomes of individuals in k = 1,..., K groups, each group with n(k) individuals. Let y(k) be the n(k)-vector of incomes of group k. We want to express total inequality I as a sum of between-group and within-group inequality [21]:
To compute the contribution of between-group inequality, we compute the fall of inequality observed when the mean incomes of the groups are equalized. This can be done either before or after within-group inequality has been removed. Hence, the Shapley contribution of between-group inequality is given by:
where 1(k) is a unit vector of size n(k). The within-group contribution is then given as
The second step consists in decomposing total within-group inequality as a sum of within-group inequality across groups. To do this, we proceed by replacing the incomes of those in a group k by μ(k) in order to eliminate group k's contribution to total within-group inequality. The fall in inequality induced by this equalization of incomes is the contribution of group k to total within-group inequality. We compute this for each group. Given that this computation depends on the sequence ordering of the groups, we compute the average contribution of a group k over all possible orderings of groups. This gives the Shapley value of group k's contribution to total within-group inequality.
[21] DAD: Decomposition|S-Gini: Decomposition by Groups.
To formalize this, suppose that there are only two groups, k = 1, 2. The first group's contribution to total within-group inequality is given as
and symmetrically for the second group.
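The first step of the procedure can be sketched as follows for the Gini index. This is only an illustration under stated assumptions: the data are hypothetical, the Gini is computed from mean absolute differences, between-group inequality is removed by rescaling each group so that its mean equals the overall mean, and the index of a fully equalized distribution is taken to be zero.

```python
def mean(xs):
    return sum(xs) / len(xs)

def gini(incomes):
    n, mu = len(incomes), mean(incomes)
    return sum(abs(a - b) for a in incomes for b in incomes) / (2 * n * n * mu)

def shapley_two_factor(groups, index):
    pooled = [y for g in groups for y in g]
    mu = mean(pooled)
    # Within-group inequality removed: everyone receives the mean of his group.
    group_means_only = [mean(g) for g in groups for _ in g]
    # Between-group inequality removed: each group is rescaled to have mean mu.
    equal_group_means = [y * mu / mean(g) for g in groups for y in g]
    total = index(pooled)
    i_between_only = index(group_means_only)
    i_within_only = index(equal_group_means)
    # Average of the two marginal effects of each factor (zero once both are removed).
    between = 0.5 * ((total - i_within_only) + (i_between_only - 0.0))
    within = 0.5 * ((total - i_between_only) + (i_within_only - 0.0))
    return between, within, total

groups = [[10, 20, 30], [40, 60, 100, 120]]   # hypothetical subgroups
between, within, total = shapley_two_factor(groups, gini)
print(between, within, total)
```

By construction the two contributions sum exactly to total inequality, which is the property that motivates the use of the Shapley value here.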
The Shapley value is a solution concept often employed in the theory of cooperative games. Consider a set S of s players who must divide some surplus among themselves. The question to resolve is: how can we divide the surplus between the s players? To see how, suppose that the s players can form coalitions (these coalitions are subsets of S) to extract a part of the surplus and redistribute it between their σ members. Suppose that the function V determines the extracting force of the coalition, viz, that amount of the surplus that it can extract without resorting to an agreement with those players that are outside of the coalition. The value of an additional player i in a coalition is given by
The term MV(σ, i) equals the marginal value added by player i after his adhesion to the coalition σ. What will then be the expected marginal contribution of player i over the different possible coalitions that can be formed and which he can join? Note that the number of possible permutations of the s players equals s!. Note also that the size of coalitions is limited to σ ∈ {0, 1, ..., s − 1}. Out of s! possible permutations of players, the number of times that the same first σ players are located in a same coalition is given by the number of possible permutations of the σ players in that coalition, that is, by σ!. For every permutation in the coalition we find (s − σ − 1)! permutations for the players that complement the coalition (excluding player i). The Shapley value gives the expected marginal value that player i generates after his adhesion to a coalition of any possible size σ. It is thus given by:
This decomposition procedure has two useful properties. The first is symmetry, ensuring that the contribution of each factor is independent of the order in which it appears in the initial list or sequence of factors. The second property is exactness and additivity, from which the total surplus is given by
For decompositions of inequality or poverty indices, say, applying a Shapley procedure consists in computing the marginal effect on such indices of removing each contributing factor (between-group or within-group inequality, inequality in income components, differences in mean income, etc.) in a given sequence of elimination. Repeating the computation for all possible elimination sequences, we estimate the mean of the marginal effects for each factor. This mean provides the contribution of each such factor. The contributions of all factors yield an exact, additive decomposition of distributive indices and variations in them into s contributions.
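A minimal sketch of the general procedure averages each player's marginal contribution over all s! orderings, which is equivalent to weighting coalitions of size σ by σ!(s − σ − 1)!/s!. The two-player value function below is purely hypothetical and serves only to check exactness.

```python
from itertools import permutations

def shapley_values(players, value):
    """Average marginal contribution of each player over all orderings of players."""
    contributions = {i: 0.0 for i in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for i in order:
            joined = coalition | {i}
            contributions[i] += value(joined) - value(coalition)  # marginal value of i
            coalition = joined
    return {i: c / len(orderings) for i, c in contributions.items()}

# Hypothetical surplus: the square of the summed "weights" of coalition members.
weights = {"A": 1, "B": 2}

def surplus(coalition):
    return sum(weights[i] for i in coalition) ** 2

shapley = shapley_values(["A", "B"], surplus)
print(shapley)   # contributions sum to surplus({"A", "B"})
```

Enumerating all orderings is only practical for a handful of factors, since the number of permutations grows as s!.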
The literature on the measurement of inequality and social welfare is very large. General references include Atkinson (1983), Atkinson and Bourguignon (2000), Atkinson and Micklewright (1992), Bishop, Formby, and Smith (1993), Chakravarty (1990), Champernowne and Cowell (1998), Cowell (1995), Cowell (2000), Essama-Nssah (2000), Foster and Sen (1997), Johnson and Shipp (1997), Lambert (2001), Sen (1973), Sen (1992), and Saunders (1994).
Applications to real data are very numerous too. Among the most influential recent ones are Bourguignon and Morrisson (2002), Danziger and Gottschalk (1995), Gottschalk and Smeeding (1997), Gottschalk and Smeeding (2000), Jäntti (1997) and Milanovic (2002).
Seminal work on inequality measurement and Lorenz curves includes Atkinson (1970), Blackorby and Donaldson (1978), Dalton (1920), Dasgupta, Sen, and Starrett (1973), Gini (1914) (see Gini 2005 for a recent English translation), Hainsworth (1964), Kakwani (1977a), Kolm (1969), Lorenz (1905) and Rothschild and Stiglitz (1973). Aaberge (2000) rationalizes the use of "moments of Lorenz curves" as measures of inequality, and Aaberge (2001a) presents axiomatic bases for the use of Lorenz curve orderings. Foster and Ok (1999) analyze the concordance of the variance of logarithms with Lorenz dominance.
Discussion and interpretation of linear (or rank-dependent) indices of inequality can be found in Aaberge (1997), Aaberge (2000), Anand (1983), Barrett and Salles (1995), Ben-Porath and Gilboa (1994), Blackburn (1989), Blackorby, Bossert, and Donaldson (1994), Bossert (1990), Chakravarty (1988), Chew and Epstein (1989), Donaldson and Weymark (1980) and Donaldson and Weymark (1983) (for S-Ginis), Duclos (1997a), Weymark (1981), Yaari (1988), Yitzhaki (1983) (for extended Ginis, equivalent to S-Ginis; see also Kakwani (1980)), and Wang and Tsui (2000). The most popular member of the class of linear inequality indices is the Gini index: it is discussed in detail in Deutsch and Silber (1997), Milanovic (1994b), Milanovic (1997), Subramanian (2002) and Yitzhaki (1998).
The theory and the economic measurement of relative deprivation are explored inter alia in Berrebi and Silber (1985), Chakravarty and Chakraborty (1984), Clark and Oswald (1996), Davis (1959), Duclos (2000), Ebert and Moyes (2000), Festinger (1954), Hey and Lambert (1980), Merton and Rossi (1957), Paul (1991), Podder (1996), Runciman (1966), Silver (1994), Wang and Tsui (2000), Yitzhaki (1979), Yitzhaki (1982a) and Nolan and Whelan (1996).
Discussion and use of the Theil index appear inter alia in Beblo and Knaus (2001), Duro and Esteban (1998) and Goerlich Gisbert (2001).
Other inequality indices are discussed in Araar and Duclos (2003) and Berrebi and Silber (1981) (a combination of Atkinson and Gini inequality indices), Chakravarty (2001) (a defense of the use of the variance), del Rio and Ruiz-Castillo (2001) (for "intermediate inequality measures"), and Foster and Shneyerov (2000) (for "path-independent decomposable measures").
Decomposition of inequality across population subgroups has also been the focus of a large literature. This has mostly involved using additive and Generalized entropy indices: see, for instance, Bourguignon (1979), Cowell (1980), Foster and Shneyerov (1999), Mookherjee and Shorrocks (1982), Shorrocks (1980), Shorrocks (1984), Schwarze (1996) and Zandvakili (1999). Decompositions of the Gini and rank-dependent inequality indices are investigated in Dagum (1997), Deutsch and Silber (1999a), Deutsch and Silber (1999b), Milanovic and Yitzhaki (2002), Sastry and Kelkar (1994), Tsui (1998) and Yitzhaki and Lerman (1991). A money-metric cost-of-inequality approach to decomposing inequality across subpopulations is derived in Blackorby, Donaldson, and Auersperg (1981), Duclos and Lambert (2000) and Ebert (1999). Alternative decomposition approaches are also explored in Cowell and Jenkins (1995), Fields and Yoo (2000), Fournier (2001), Hyslop (2001), Jenkins (1995), Parker (1999), and Schultz (1998).
The Shapley value was introduced by Shapley (1953). See also Owen (1977) for how a two-stage decomposition procedure can be applied to the Shapley value, as well as Shorrocks (1999) and Chantreuil and Trannoy (1999) for its use in distributive analysis.
Two approaches have been used to devise cardinal indices of poverty. The first uses the concept of equally distributed equivalent (EDE) incomes, and applies it to distributions whose incomes have been censored at the poverty line. It then compares those EDE incomes to the poverty line. The second approach transforms incomes and the poverty line into poverty gaps, and aggregates these gaps using social-welfare-like functions. We look at these two approaches in turn.
For the EDE approach to building poverty indices, we start with the distribution of income Q(p). Since, for poverty comparisons, we want to focus on those incomes that fall below the poverty line (the "focus axiom"), the incomes Q(p) are censored at the poverty line z to give Q*(p; z). The censored incomes are then aggregated using one of the many social welfare functions that have been proposed in the literature, such as the Atkinson or S-Gini ones. A poverty index is obtained by taking the difference between the poverty line and the EDE income. For instance, for the social welfare functions proposed in section 4.3, this procedure leads to the following class of poverty indices:
where ξ*(z; ρ, ε) is the EDE income of the distribution of censored income Q*(p; z) and where we need ρ ≥ 1 and ε ≥ 0 for the Pigou-Dalton transfer principle not to be violated. P(z; ρ, ε) can then be interpreted as the "socially representative" or EDE poverty gap.
Examples of such poverty indices include a transformation of Clark, Hemming and Ulph's (CHU) second class of poverty indices, given by P(z; ε) = P(z; ρ = 1, ε) [1]:
The CHU indices are then obviously closely related to the Atkinson social welfare functions and inequality indices. When ε = 1, the CHU poverty index is also the EDE poverty gap corresponding to the Watts poverty index, an index which is defined as [2]:
For 0 ≤ ε < 1, the CHU indices also correspond to the EDE poverty gap of the class of poverty indices proposed by Chakravarty, PC(z; ε):
Moreover, if we choose ε = 0 for the class of indices defined in (5.1), we obtain the class of S-Gini indices of poverty, P(z; ρ) [3]:
P(z; ρ = 2) is then a "Gini-like" index of poverty.
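The link between the CHU and Watts indices can be illustrated with a short sketch. It assumes that the CHU index is the poverty line less the Atkinson EDE of censored incomes, and that the Watts index is the average of ln(z/Q*(p; z)); the sample, the poverty line and the function names are hypothetical.

```python
import math

def censor(incomes, z):
    return [min(y, z) for y in incomes]

def atkinson_ede(incomes, eps):
    """Atkinson equally distributed equivalent income (incomes must be positive)."""
    n = len(incomes)
    if eps == 1:
        return math.exp(sum(math.log(y) for y in incomes) / n)
    return (sum(y ** (1 - eps) for y in incomes) / n) ** (1 / (1 - eps))

def chu_index(incomes, z, eps):
    # EDE poverty gap: poverty line less the EDE of censored incomes.
    return z - atkinson_ede(censor(incomes, z), eps)

def watts_index(incomes, z):
    return sum(math.log(z / y) for y in censor(incomes, z)) / len(incomes)

incomes, z = [50, 80, 120, 200], 100      # hypothetical sample and poverty line
chu_eps0 = chu_index(incomes, z, 0)       # with eps = 0 this is the average poverty gap
chu_eps1 = chu_index(incomes, z, 1)
watts = watts_index(incomes, z)
print(chu_eps0, chu_eps1, z * (1 - math.exp(-watts)))
```

Under these assumptions the CHU index at ε = 1 equals z(1 − exp(−W)), where W is the Watts index, which is the sense in which it is the EDE poverty gap corresponding to W.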
The second approach to constructing poverty indices uses the distribution of poverty gaps, g(p; z) = z − Q*(p; z). Once this distribution is known, no other use of the poverty line is needed for the aggregation of poverty. Because of this, the poverty gap approach to constructing poverty indices is slightly more restrictive and also puts more structure on the shape of the allowable poverty indices than the previous EDE approach. After the distribution of poverty gaps has been computed, we may use aggregating functions analogous to those used in Section 4.3 for the analysis of social welfare. Like social welfare functions, where we normally want an increase in someone's income to increase social welfare, we would normally wish the poverty indices to be increasing in poverty gaps. Unlike social welfare functions, however, where an equalizing Pigou-Dalton transfer would often increase the value of a social welfare function, we would typically wish a poverty index to decrease when such an equalizing transfer of income takes place.
[1] DAD: Poverty|CHU Index.
[2] DAD: Poverty|Watts Index.
[3] DAD: Poverty|S-Gini Index.
A popular class of poverty gap indices that can obey these axioms is known as the Foster-Greer-Thorbecke (FGT) class. It differentiates its members using an ethical parameter α ≥ 0 and is generally defined as [4]
$$\bar{P}(z; \alpha) = \int_0^1 \left( \frac{g(p; z)}{z} \right)^{\alpha} dp$$
E: 18.7.4
for the normalized FGT poverty indices and as
$$P(z; \alpha) = \int_0^1 g(p; z)^{\alpha} \, dp$$
for the unnormalized version (which can sometimes be more useful than the more usual normalized form). Note that poverty gap indices other than the FGT ones can also be easily proposed, simply by using other aggregating functions of poverty gaps that obey some of the desirable axioms (such as that of being increasing and convex in poverty gaps) discussed in the literature.
When α = 0, the FGT index gives the simplest and most commonly used poverty index. It is called the poverty headcount ratio, and is simply the proportion of a population that is in poverty (those with a positive poverty gap), F(z) [5]. The shorter expression "poverty headcount" is sometimes meant to indicate the absolute (as opposed to the relative) number of the poor in the population. Since our population size is normalized to 1 in this book, we will use the two expressions "headcount" and "headcount ratio" interchangeably.
E:18.1.1
The next simplest and most commonly used index, μg(z), is given by the average poverty gap, P (z; α = 1), and is the average shortfall of income from the poverty line:
To see how to interpret the form of the FGT indices for general values of α, consider Figure 5.1. It shows the (absolute) contributions to the normalized index of total poverty P̄(z; α) of individuals at different ranks p. These contributions are given by (g(p; z)/z)^α. For α = 0, the contribution is a constant 1 for the poor and 0 for the rich (those whose rank exceeds F(z) on the Figure, or equivalently those whose income Q(p) exceeds z). The headcount is then the area covered by the dotted rectangle on Figure 5.1. For α = 1, the contribution of someone at p equals his normalized poverty gap, g(p; z)/z. Poverty is then the area underneath the g(p; z)/z curve drawn on Figure 5.1. The same reasoning is valid for higher values of α. For instance, the absolute contribution to P̄(z; α = 3) of individuals at rank p is given by (g(p; z)/z)^3 on Figure 5.1, and P̄(z; α = 3) equals the area underneath the (g(p; z)/z)^3 curve.
[4] DAD: Poverty|FGT Index.
[5] DAD: Poverty|FGT Index.
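The averaging of individual contributions described above translates directly into a discrete-sample sketch. The incomes and the poverty line are hypothetical, and the function name and its `normalized` flag are assumptions of this illustration rather than DAD's interface.

```python
def fgt(incomes, z, alpha, normalized=True):
    """FGT index: population average of (gap/z)**alpha, or gap**alpha if unnormalized."""
    scale = z if normalized else 1.0
    poor_gaps = [z - y for y in incomes if y < z]   # only positive poverty gaps contribute
    return sum((g / scale) ** alpha for g in poor_gaps) / len(incomes)

incomes, z = [40, 70, 90, 150, 300], 100   # hypothetical sample and poverty line
headcount = fgt(incomes, z, 0)             # proportion with a positive poverty gap
normalized_gap = fgt(incomes, z, 1)        # average of gap/z
squared_gap = fgt(incomes, z, 2)
average_gap = fgt(incomes, z, 1, normalized=False)   # average shortfall from z
print(headcount, normalized_gap, squared_gap, average_gap)
```

Because normalized gaps lie between 0 and 1, raising them to a higher power lowers every contribution, which is why the normalized indices fall as α increases in this example.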
Notwithstanding the above, interpreting the numerical value of FGT indices for α different from 0 and 1 can be problematic. We can easily understand what is meant by a proportion of the population in poverty or by an average poverty gap, but what, for instance, can a squared-poverty-gap index actually signify? And how to explain it to a government Minister? A further difficulty with such indices emerges from a closer look at Figure 5.1, which indicates that the absolute contribution of poverty gaps to poverty decreases with α: the contribution curves (g(p)/z)^α move down as α rises. This also implies that the normalized FGT indices necessarily fall as α increases. This is paradoxical since it is usually argued that the higher the value of α, the greater the focus on those who suffer most "severely" from poverty. It would thus be more natural if an increase in α also increased P̄(z; α).
One partial solution to these interpretive problems is to switch one's focus from the absolute to the relative contribution to an FGT index of individuals with different poverty gaps. Such a relative contribution is depicted on Figure 5.2 for α = 0, 1 and 2. It shows the ratio of the absolute contributions g(p)^α to total poverty P(z; α); these ratios are the same for normalized and unnormalized FGT indices. Since this graph shows relative contributions to total poverty, the area underneath each of the three curves must in all cases equal 1.
For α = 0, each poor person contributes relatively the same constant 1/F(z) to the poverty headcount. The poor's relative contribution to the average poverty gap increases with their own poverty gap, as shown by the curve g(p)/P(z; α = 1). That relative contribution equals 1 for those individuals whose own poverty gap is precisely equal to the average poverty gap. The rank of such individuals is given by F(μg(z)), as is also shown on Figure 5.2. Thus, those located at p = F(μg(z)) have a poverty gap that is representative of the average poverty gap in the population. Increasing α from 1 to 2 decreases the relative contribution of the not-so-poor, but inversely increases the contribution of those with the highest poverty gaps, as shown by the curve g(p; z)^2/P(z; α = 2). This then becomes consistent with the general view that, in the aggregation of individual poverty, higher values of α put more emphasis on those who suffer most severely from poverty, namely those with lower values of p and higher values of g(p; z).
Figure 5.2 does not, however, solve the main interpretation problems associated with the FGT indices. As mentioned above, explaining to non-technicians or policymakers the practical meaning of FGT indices for general values of α is difficult since these indices are averages of powers of poverty gaps. They are also neither unit-free nor money-metric (except for α = 0 and 1). Another already-mentioned difficulty is that the usual FGT indices will generally fall with an increase in the value of their poverty aversion parameter, α.
A simple solution to these two problems is to transform the FGT indices into EDE poverty gaps. An EDE poverty gap is that poverty gap which, if it were assigned equally to all individuals, would yield the same aggregate poverty index as that which is currently observed. An EDE poverty gap can then usefully be interpreted as a socially-representative poverty gap. This transformation provides a money-metric measure of poverty which can be usefully compared across different poverty indices and/or across different values of α. As we will see later, it also allows the analyst to determine the impact of poverty-gap inequality upon the level of poverty. For the unnormalized FGT indices, the EDE poverty gap is given simply by (for α > 0) [6]
$$\xi_g(z; \alpha) = \left[ P(z; \alpha) \right]^{1/\alpha}$$
For the normalized FGT indices, it is just ξg(z; α)/z. An EDE poverty gap cannot be defined for α = 0.
Figure 5.3 shows such socially-representative poverty gaps ξg(z; α) for different values of α. In each case, we obtain a socially-weighted money-metric indicator of the distribution of deprivation in the population. This summary aggregate indicator can also be compared to the individual distribution of poverty, given by the g(p; z) curve. Those whose g(p; z) exceeds ξg(z; α) experience more poverty than the socially representative average. Those exactly at ξg(z; α) are located exactly at the socially representative poverty gap. Those representative individuals are thus found at the ranks given by F(ξg(z; α)), which are also shown on Figure 5.3 for different values of α.
An important point to note is that an increase in α moves the socially-representative poverty gap closer to that experienced by the poorest individuals. This is since ξg(z; α + 1) ≥ ξg(z; α) for any α > 0. (This is unlike the usual, normalized definition of the FGT indices, for which we have P̄(z; α + 1) ≤ P̄(z; α) for any α > 0.) Hence, we can readily interpret increases in α as leading to increases in the socially-representative poverty gap, and thus in the relative weight given to the poorer of the poor. The larger the value of α, the more important are the most severe cases of deprivation in computing a socially-representative aggregate level of poverty.
[6] DAD: Poverty|FGT Index.
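The monotonicity of the EDE poverty gap in α can be checked with a small sketch. It takes the EDE gap to be the α-th root of the unnormalized FGT index, that is, the gap which would yield the same index if assigned equally to everyone; the sample and poverty line are hypothetical.

```python
def ede_poverty_gap(incomes, z, alpha):
    """Socially representative gap: alpha-th root of the unnormalized FGT index (alpha > 0)."""
    gaps = [max(z - y, 0.0) for y in incomes]
    unnormalized_fgt = sum(g ** alpha for g in gaps) / len(gaps)
    return unnormalized_fgt ** (1.0 / alpha)

incomes, z = [40, 70, 90, 150, 300], 100   # hypothetical sample and poverty line
ede_gaps = {alpha: ede_poverty_gap(incomes, z, alpha) for alpha in (1, 2, 3)}
print(ede_gaps)   # rises with alpha, towards the largest individual gap of 60
```

All three values are expressed in the same monetary units as income, which is what makes them comparable across values of α.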
Note finally that, besides being already in an EDE poverty gap form, the S-Gini index of poverty also has the property of being a poverty gap index. Indeed, by (5.5), we have that
Much of the early literature on the construction of poverty indices focussed on whether indices were decomposable across population subgroups. This has led to the identification of a subgroup of poverty indices known as the "class of decomposable poverty indices". These indices have the property of being expressible as a weighted sum (more generally, as a separable function) of the same poverty indices assessed across population subgroups. They most commonly include the FGT and the Chakravarty classes of indices as well as the Watts index.
Let the population be divided into K mutually exclusive population subgroups, where φ(k) is the share of the population found in subgroup k. For the FGT indices, we then have that:
$$P(z; \alpha) = \sum_{k=1}^{K} \phi(k) \, P(k; z; \alpha) \qquad (5.11)$$
where P(k; z; α) is the FGT poverty index of subgroup k [7]. The Watts and Chakravarty indices are expressible as a sum of the poverty indices of each subgroup in exactly the same way as for the FGT indices in (5.11).
E: 18.6
To illustrate the practical implications of the group-decomposition property, consider the following two-group (K = 2) example. Let the first group contain 40% of the total population, and let poverty in group 1 be 0.8 and that of group 2 be 0.4. Poverty in the total population is then a simple weighted mean of group poverty, and is immediately computable as 0.4 · 0.8 + 0.6 · 0.4 = 0.56. Estimates of total poverty in a population can then be constructed in a decentralized manner, first by estimating poverty within communities or regions, and then by averaging over these decentralized estimates, without there being a need for all of the micro data to be regrouped in one single register.
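The decentralized computation can be sketched in a few lines. The first part reproduces the two-group arithmetic above; the second checks, on hypothetical micro data, that the population-weighted mean of group FGT indices matches the index computed on the pooled data. Function names are illustrative only.

```python
def fgt(incomes, z, alpha):
    """Normalized FGT index for a discrete sample."""
    return sum(((z - y) / z) ** alpha for y in incomes if y < z) / len(incomes)

def aggregate(shares, group_poverty):
    """Total poverty as the population-share-weighted mean of group poverty."""
    return sum(phi * p for phi, p in zip(shares, group_poverty))

# Two-group example from the text.
total_example = aggregate([0.4, 0.6], [0.8, 0.4])

# Hypothetical micro data split into two regions.
region_1, region_2, z = [40, 70], [90, 150, 300], 100
n = len(region_1) + len(region_2)
shares = [len(region_1) / n, len(region_2) / n]
decentralized = aggregate(shares, [fgt(region_1, z, 1), fgt(region_2, z, 1)])
pooled = fgt(region_1 + region_2, z, 1)
print(total_example, decentralized, pooled)
```

Only the group shares and group indices travel to the centre; the micro data themselves never need to be pooled.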
[7] DAD: Decomposition|FGT: Decomposition by Groups.
Subgroup decomposability also implies that an income improvement in one of the subgroups will necessarily improve aggregate poverty if the incomes in the other groups have not changed. It will also mean that the optimal design of social safety nets and benefit targeting within any given group can be assessed independently of the income distribution in the other groups: only the distributive characteristics of the relevant group matter for the exercise. If targeting succeeds in decreasing poverty at a local level, then it must also succeed at the aggregate level.
Subgroup decomposability is therefore useful, although it is certainly not imperative for poverty analysis. In particular, it is not because an index facilitates poverty profiling and targeting analysis that this index is necessarily ethically fine. Ease of computation and ethical soundness are also two different and potentially conflicting criteria. Among other things, imposing the decomposability and additivity property can mean sacrificing some important ethical features in the aggregation of poverty. In that context, Ravallion (1994) notes that when measuring poverty "one possible objection to additivity is that it attaches no weight to one aspect of a poverty profile: the inequality between subgroups in the extent of poverty". This can be an important flaw if for instance between-group relative deprivation is considered ethically significant.
Expressing poverty indices in the form of EDE poverty gaps enables the decomposition of poverty as a sum of average poverty and inequality in poverty. Let ξg(z) be the EDE poverty gap and Cg(z) be the cost of inequality in poverty gaps. We then have:
or, alternatively,
For instance, for the popular FGT indices, we have that the cost of inequality in poverty gaps is given by:
When α = 1, we have that the socially representative poverty gap ξg(z) is just the average poverty gap μg(z); inequality in poverty gaps is thus not taken into account in assessing poverty. The poverty cost of inequality is then nil. Since μg(z) is insensitive to α, and since ξg(z; α) is increasing in α, it follows that Cg(z; α) is also increasing in α; the larger the value of α, the larger the impact of inequality on the level of aggregate poverty. This can be checked on Figure 5.3. We can thus interpret α as a parameter of inequality aversion in measuring poverty. For 0 < α < 1, we have that ξg(z; α) < μg(z), and inequality in poverty is then deemed to reduce poverty: Cg(z; α) < 0. Ceteris paribus, we then have that the greater the level of inequality, the lower the socially representative level of poverty. For α > 1, we have that Cg(z; α) > 0 and inequality has therefore a positive poverty cost.
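The sign pattern described above can be verified numerically. The sketch takes the cost of inequality in poverty gaps to be the EDE poverty gap less the average poverty gap, with the EDE gap computed as the α-th root of the unnormalized FGT index; the sample and the function names are hypothetical.

```python
def poverty_gaps(incomes, z):
    return [max(z - y, 0.0) for y in incomes]

def ede_poverty_gap(incomes, z, alpha):
    gaps = poverty_gaps(incomes, z)
    return (sum(g ** alpha for g in gaps) / len(gaps)) ** (1.0 / alpha)

def cost_of_inequality(incomes, z, alpha):
    """EDE poverty gap less the average poverty gap (assumed additive decomposition)."""
    gaps = poverty_gaps(incomes, z)
    return ede_poverty_gap(incomes, z, alpha) - sum(gaps) / len(gaps)

incomes, z = [40, 70, 90, 150, 300], 100   # hypothetical sample and poverty line
costs = {alpha: cost_of_inequality(incomes, z, alpha) for alpha in (0.5, 1, 2, 3)}
print(costs)   # negative for alpha < 1, zero at alpha = 1, positive and rising above 1
```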
A similar decomposition can be done using (5.1) and the EDE level of censored income. The EDE poverty gap corresponding to that approach is defined as
where C*(z; ρ, ε) = μ*(z) · I*(z; ρ, ε) is the cost of inequality in censored income and where I*(z; ρ, ε) is the index of inequality in censored income.
It is often informative to portray the whole distribution of poverty gaps on a simple graph, in a way which shows both the incidence and the inequality of income deprivation. Particularly useful is the poverty gap curve, which plots g(p; z) as a function of p — see again Figure 5.3. The curve naturally decreases with the rank p in the population, and reaches zero at the value of p equal to the headcount. The integral under the curve gives the average poverty gap, and its steepness indicates the degree of inequality in the distribution of poverty gaps.
Another percentile-based curve that is graphically informative and that is useful for the measurement and comparison of poverty is called the Cumulative Poverty Gap (CPG) curve (also sometimes referred to as the inverse Generalized Lorenz curve, the "TIP" curve, or the poverty profile curve). The CPG curve cumulates the poverty gaps of the bottom p proportion of the population. It is defined as [8]:
E:18.7.8
A CPG curve is drawn on Figure 5.4. The slope of G(p; z) at a given value of p shows the poverty gap g(p; z). Since g(p; z) is nonnegative, G(p; z) is nondecreasing. G(p = 1; z) equals the average poverty gap μg(z). The percentile at which G(p; z) becomes horizontal (where g(p; z) becomes zero) yields the poverty headcount. Furthermore, since the higher his rank p in the population, the richer is an individual, and therefore the lower is his poverty gap, G(p; z) is concave in p. Because of this, the CPG curve exhibits for poverty analysis the same descriptive interest as the Lorenz and Generalized Lorenz curves for the analysis of inequality and social welfare. The distance of G(p; z) from the line of perfect equality of poverty gaps (namely, the line 0B in Figure 5.4) shows the inequality of poverty gaps among the total population. The distance of G(p; z) from the line of perfect equality of poverty gaps among the poor (namely, the line 0A in Figure 5.4) displays the inequality of poverty gaps among the poor. Finally, the concavity of G(p; z) is inversely related to the density of poverty gaps at p.
[8] DAD: Curves|CPG.
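These properties of the CPG curve can be read off a small discrete example. The sketch evaluates G(p; z) at p = 1/n, 2/n, ..., 1 by cumulating the poverty gaps of the poorest individuals and dividing by the population size; the data and function name are hypothetical.

```python
def cpg_curve(incomes, z):
    """G(p; z) at p = 1/n, ..., 1: cumulated poverty gaps of the poorest, per capita."""
    ys = sorted(incomes)            # poorest first, so the largest gaps come first
    n = len(ys)
    curve, running = [], 0.0
    for y in ys:
        running += max(z - y, 0.0)
        curve.append(running / n)
    return curve

incomes, z = [300, 40, 150, 90, 70], 100   # hypothetical sample and poverty line
curve = cpg_curve(incomes, z)
increments = [b - a for a, b in zip([0.0] + curve[:-1], curve)]
print(curve)        # flat from p = 3/5 onwards, the headcount
print(increments)   # non-increasing increments: the curve is concave
```

The last ordinate is the average poverty gap, and the first rank at which the curve stops rising is the headcount.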
When weighted by K(p; ρ), the area underneath the CPG curve generates the class of S-Gini poverty indices:9
$$P(z; \rho) = \int_0^1 G(p; z)\, K(p; \rho)\, dp.$$
Recall that $K(p; \rho) = \rho(\rho - 1)(1 - p)^{\rho - 2}$. P(z; ρ = 1) thus equals the average poverty gap, μg(z); P(z; ρ = 2) is the poverty index that is analogous to the standard Gini index of inequality; and the well-known Sen index of poverty is given by:
An interesting feature of the P(z; ρ) indices is their link with absolute and relative deprivation. Let absolute deprivation, AD(z), be given by the average shortfall from the poverty line, that is, by μg(z). Recalling (4.25) and (4.26), we can define relative deprivation in censored income at percentile p as:
Average relative deprivation across the whole population is then:
It is then possible to show that:
The larger the value of ρ, the larger is relative deprivation, RD(z; ρ), and the larger are P(z;ρ) and the contribution of relative deprivation and inequality to poverty. This provides an alternative link between inequality and poverty.
9 DAD: Poverty|S-Gini Index.
Most of the poverty indices discussed above were initially introduced in the literature in a normalized form, that is, by dividing censored income and poverty gaps by the poverty line. The FGT indices, for instance, are generally expressed as:10
$$\bar{P}(z; \alpha) = \int_0^1 \left(\frac{g(p; z)}{z}\right)^{\alpha} dp$$
(see (5.6)). Normalizing poverty indices will make no substantial difference and little expositional difference for poverty analysis when the distributions of income being compared have identical poverty lines. This will typically be the case, for instance, when incomes are expressed in real (or constant) values, and when the focus is on absolute poverty with constant real poverty lines. Normalizing poverty indices by the poverty line will
make the EDE poverty gap lie between 0 and 1,
make poverty indices insensitive to and independent of the monetary units (e.g., dollars or cents) used in assessing income, and
make the indices invariant to an equiproportionate change in all incomes and in the poverty line.
Normalizing poverty indices is particularly useful if the poverty lines serve as price indices, and are thus used to enable comparisons of nominal income across time and space (recall that price indices are used to convert nominal incomes into base-year real incomes).
Normalized poverty indices are usually referred to as "relative poverty indices"; changing all incomes and the poverty line by the same proportion will not affect the value of relative poverty indices. FGT and other poverty gap indices that are not normalized are often called "absolute" poverty indices; it can be checked that equal absolute additions to all incomes and to the poverty line will not affect their value. Increasing all incomes and the poverty line by the same proportion will, however, increase the value of such absolute poverty indices.
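The two invariance properties can be verified numerically. The sketch below is illustrative only: the function name `fgt`, its `normalized` flag and the sample incomes are assumptions made for this example, and the index is computed on an unweighted sample.

```python
import math


def fgt(incomes, z, alpha, normalized=True):
    """FGT index as the population mean of gap**alpha over the poor.

    With normalized=True each gap is divided by the poverty line z
    (a "relative" index); otherwise gaps are left in monetary units
    (an "absolute" index).
    """
    total = 0.0
    for y in incomes:
        gap = max(z - y, 0.0)
        if gap > 0:
            total += (gap / z if normalized else gap) ** alpha
    return total / len(incomes)


incomes = [500, 800, 1200, 2000]    # hypothetical incomes
z, alpha = 1000, 1

# Equiproportionate change: all incomes and the poverty line are doubled.
scaled = [2 * y for y in incomes]
relative_unchanged = math.isclose(fgt(scaled, 2 * z, alpha), fgt(incomes, z, alpha))
absolute_doubles = math.isclose(
    fgt(scaled, 2 * z, alpha, normalized=False),
    2 * fgt(incomes, z, alpha, normalized=False),
)

# Equal absolute addition: 300 is added to all incomes and to the poverty line.
shifted = [y + 300 for y in incomes]
absolute_unchanged = math.isclose(
    fgt(shifted, z + 300, alpha, normalized=False),
    fgt(incomes, z, alpha, normalized=False),
)
relative_falls = fgt(shifted, z + 300, alpha) < fgt(incomes, z, alpha)

print(relative_unchanged, absolute_doubles, absolute_unchanged, relative_falls)
```

With α = 1 the absolute index exactly doubles under the equiproportionate change, while the relative index falls under the equal absolute addition because the same gaps are divided by a larger poverty line.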
When poverty lines are different across distributions, and when their ratio across time or space cannot be interpreted simply as a ratio of price indices, the normalization of poverty indices by these poverty lines can, however, be problematic, and is surely open to debate. This is the case, for instance, when we are interested in comparing the absolute shortfalls of "real" income from a "real" poverty line, when these real poverty lines vary across populations or population subgroups. Examples can arise, inter alia, in comparing the poverty of families of different sizes and composition, or in comparing poverty across distributions with different social or cultural bases for the definition of a poverty line.
10 DAD: Poverty|FGT Index.
To see this more clearly, consider the following example in which all incomes and poverty lines are expressed in real terms (namely, they have been adjusted for differences in the cost of living, and they are therefore comparable). In country A, the poverty line is $1,000, and a poor person i has an income of $500. Because, say, of cultural and/or sociological differences (these differences may exist across time or space), the poverty line in country B is larger and is equal to $2,000, and a poor person j in it has an income equal to $1,100. Which of i and j is poorer? If we adopt the relative view to building poverty indices, i will be considered the poorer since, as a proportion of his poverty line, his shortfall (50%) is larger than that of j (45%). If, instead, absolute poverty indices are used, j will be deemed the poorer since his absolute poverty gap ($900) is by far larger than that of i ($500). Which of these two views should prevail is then open to debate.
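The ranking reversal in this example can be reproduced in a few lines. This is only a restatement of the arithmetic in the text; the function names are hypothetical.

```python
def absolute_gap(income, z):
    """Shortfall from the poverty line, in monetary units."""
    return max(z - income, 0.0)


def relative_gap(income, z):
    """Shortfall expressed as a proportion of the poverty line."""
    return absolute_gap(income, z) / z


# Figures taken from the example: (income, poverty line) in real dollars.
person_i = (500, 1000)     # country A
person_j = (1100, 2000)    # country B

abs_i, abs_j = absolute_gap(*person_i), absolute_gap(*person_j)
rel_i, rel_j = relative_gap(*person_i), relative_gap(*person_j)

poorer_absolute = "i" if abs_i > abs_j else "j"
poorer_relative = "i" if rel_i > rel_j else "j"

print(abs_i, abs_j, rel_i, rel_j, poorer_absolute, poorer_relative)
```

The absolute view ranks j as poorer and the relative view ranks i as poorer, which is exactly the disagreement described above.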
It is often useful to determine whether it is mean-income growth or changes in the relative income shares accruing to different parts of the population that are responsible for the evolution of poverty across time. Investigating this can also help assess whether these two factors, mean-income changes and inequality changes, work in the same or in opposite directions when it comes to the behavior of aggregate poverty. Similarly, we may wish to assess whether differences in poverty across countries or regions are due to differences in inequality or to differences in mean levels of income.
There are several ways to do this. To illustrate them, assume that we wish to compare distributions A and B to determine if it is the difference in their mean income ("growth") or the difference in their income inequality ("redistribution") that accounts for their difference in poverty. The common feature of all existing growth-redistribution decomposition procedures is
1. first, to scale the two distributions A and B such that they have the same mean, and interpret the difference in poverty across these two scaled distributions as the impact on poverty of their difference in inequality;
2. and second, to interpret the difference in poverty between one of the distributions (say, A) and that same distribution scaled to the mean income of the other distribution (B) as the impact on poverty of their difference in mean income.
Starting from this, the precise growth-redistribution decomposition procedures that are chosen differ by the solution they apply to a basic problem known generally in the national-accounts literature as the "index problem". Specifically here, should we scale A to the mean of B, or B to the mean of A, to assess the impact of differences in inequality? And, in estimating the impact of differences in mean incomes, should we compare A with A-scaled-to-the-mean-of-B, or B with B-scaled-to-the-mean-of-A?
The first paper that implemented a growth-redistribution decomposition of poverty differences (Datt and Ravallion 1992) used the initial distribution as a reference "anchor point". To see how, it is easiest to use the normalized FGT indices $\bar{P}(z; \alpha)$ defined in (5.6), although the growth-redistribution decomposition methodologies can be used with any relative poverty indices, additive or not. The change in poverty between A and B is expressed as a sum of a "growth" (difference in mean income) effect and of a "redistributive" (difference in relative income shares) effect, plus an error term that originates from the above-mentioned index problem. This gives:11
The first expression in the first term of (5.23) is poverty in A after A's incomes have been scaled by μB/μA to yield a distribution with mean μB and inequality unchanged. The first term is thus the difference in poverty between two distributions with the same relative income shares but with (possibly) different mean incomes. When μB > μA, this growth term is negative: growth reduces poverty. The first expression in the second term is poverty in B after B's incomes have been scaled by μA/μB to yield a distribution with mean μA. The second term is thus the difference in poverty between two distributions with identical mean incomes but with (possibly) different inequality. When the Lorenz curve for B is everywhere above the Lorenz curve for A, this redistribution term is necessarily negative when α ≥ 1, but it can also be positive when α < 1.
The error term in (5.23) can be expressed as:
11 DAD: Decomposition|FGT: Growth & Redistribution.
This error term can be shown to be either the difference between the growth effect measured using B as a reference distribution and that using A as the reference distribution,
or the difference between the redistribution effect measured using B as the reference distribution and the redistribution effect using A as the reference distribution,
An alternative decomposition uses the posterior distribution B as the reference distribution for assessing the growth and redistribution effects. This yields:
Clearly, a middle way between these two alternative decomposition procedures is to measure the growth effect as the average of the two growth effects, in (5.23) and (5.27), and likewise to measure the redistribution effect as the average of the two redistribution effects. Proceeding this way has the advantage of eliminating the error term in the poverty decomposition, since the error terms of the two alternative decompositions sum to zero. This middle way is in fact what would be given by the use of the Shapley value to perform a growth-redistribution decomposition (see Appendix 4.7 for more details on the Shapley value). This leads to the following growth-redistribution decomposition:12
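The three variants described in this section (A as reference, B as reference, and their average) can be compared numerically. The following is a minimal sketch under stated assumptions, not DAD's routine: the function names and both income vectors are hypothetical, samples are unweighted, and poverty is measured by the normalized FGT index.

```python
def fgt(incomes, z, alpha):
    """Normalized FGT index on an unweighted sample."""
    return sum(((z - y) / z) ** alpha for y in incomes if y < z) / len(incomes)


def mean(values):
    return sum(values) / len(values)


def scale_to_mean(incomes, target_mean):
    """Rescale incomes proportionally so that their mean equals target_mean.

    Relative income shares (and hence inequality) are left unchanged.
    """
    factor = target_mean / mean(incomes)
    return [y * factor for y in incomes]


def growth_redistribution(a, b, z, alpha):
    p_a, p_b = fgt(a, z, alpha), fgt(b, z, alpha)
    p_a_at_mu_b = fgt(scale_to_mean(a, mean(b)), z, alpha)  # A's shares, B's mean
    p_b_at_mu_a = fgt(scale_to_mean(b, mean(a)), z, alpha)  # B's shares, A's mean

    # Initial distribution A as the reference.
    growth_a = p_a_at_mu_b - p_a
    redist_a = p_b_at_mu_a - p_a
    # Posterior distribution B as the reference.
    growth_b = p_b - p_b_at_mu_a
    redist_b = p_b - p_a_at_mu_b

    total = p_b - p_a
    return {
        "total": total,
        "growth_ref_a": growth_a,
        "redistribution_ref_a": redist_a,
        "error_ref_a": total - growth_a - redist_a,
        "growth_ref_b": growth_b,
        "redistribution_ref_b": redist_b,
        "error_ref_b": total - growth_b - redist_b,
        # Averages of the two reference choices: no error term remains.
        "growth_shapley": 0.5 * (growth_a + growth_b),
        "redistribution_shapley": 0.5 * (redist_a + redist_b),
    }


# Hypothetical distributions and poverty line.
a = [300, 600, 900, 1500, 2700]
b = [500, 800, 1200, 1800, 3200]
z, alpha = 1000, 1

result = growth_redistribution(a, b, z, alpha)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

In this sketch the two error terms are equal in size and opposite in sign, and the averaged growth and redistribution effects add up to the total change in poverty.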
Equation (5.11) shows how poverty can be expressed as a sum of the poverty contributions of the various subgroups that make a population. Each subgroup contributes in proportion to its share in the population and to the level of poverty found in that subgroup. Hence, we may wish to express changes in poverty across time or space as a function of differences in these factors. More precisely, we want to see whether differences in poverty across distributions can be attributed to differences in demographic or sectoral composition across these distributions, or to differences in poverty across these demographic or sectoral groups. We may express this as follows13:
12 DAD: Decomposition|FGT: Growth & Redistribution.
13 DAD: Decomposition|FGT: Sectoral.
Note that the decomposition in (5.29) suffers from the same index number problem as the earlier one in (5.23). For example, one could prefer to use φB(k) instead of φA(k) to compute the within-group poverty effects. It may also seem more convenient to weight the within-group poverty effects by the average population shares, and to weight the demographic and sectoral effects by the average poverty index. This yields:14
where $\bar{\phi}(k) = 0.5\,(\phi_A(k) + \phi_B(k))$ and $\bar{P}(k; z; \alpha) = 0.5\,(P_A(k; z; \alpha) + P_B(k; z; \alpha))$. Note from (5.30) that this decomposition procedure removes the error term. Depending on the context, the decomposition in (5.30) could serve to show, for instance, how variations in the size and in the poverty of various sectors of the economy account for variations of total poverty across economies, how differences in the size and in the poverty of various demographic groups explain differences in total poverty across societies, how migration and differential poverty across regions account for changes in poverty across time, etc.
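The averaged weighting just described can be sketched as follows. The function name, the two groups and all numbers are hypothetical; the only inputs are each group's population share and its poverty index in A and in B.

```python
def sectoral_decomposition(shares_a, poverty_a, shares_b, poverty_b):
    """Split the change in total poverty between A and B into two effects.

    within[k]  : change in group k's poverty, weighted by its average share.
    between[k] : change in group k's share, weighted by its average poverty.
    """
    within, between = {}, {}
    for k in shares_a:
        average_share = 0.5 * (shares_a[k] + shares_b[k])
        average_poverty = 0.5 * (poverty_a[k] + poverty_b[k])
        within[k] = average_share * (poverty_b[k] - poverty_a[k])
        between[k] = average_poverty * (shares_b[k] - shares_a[k])
    return within, between


def total_poverty(shares, poverty):
    """Total poverty as the share-weighted sum of group poverty indices."""
    return sum(shares[k] * poverty[k] for k in shares)


# Hypothetical population shares and group poverty indices.
shares_a = {"rural": 0.7, "urban": 0.3}
poverty_a = {"rural": 0.50, "urban": 0.20}
shares_b = {"rural": 0.6, "urban": 0.4}
poverty_b = {"rural": 0.45, "urban": 0.15}

within, between = sectoral_decomposition(shares_a, poverty_a, shares_b, poverty_b)
change = total_poverty(shares_b, poverty_b) - total_poverty(shares_a, poverty_a)

print(within, between, change)
```

The within-group and compositional effects sum exactly to the change in total poverty, so no error term is left over with this choice of weights.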
An alternative use of the decomposition in (5.11) computes the impact of a change in the proportion of the population that is found in a group k, this change being accompanied by an exactly offsetting change in the proportion of the other groups. This may be useful, for instance, if one wishes to predict the impact of migration or demographic changes on national poverty, keeping within-group poverty constant. Let the population share of a group t, φ(t), increase by a proportion λ to φ(t)(1 + λ), with a proportional fall in the other groups' population share from φ(k) to φ(k)(1 – φ(t)λ/(1 – φ(t))). Note that the new population shares will add up to 1 since
$$\phi(t)(1 + \lambda) + \sum_{k \neq t} \phi(k)\left(1 - \frac{\phi(t)\lambda}{1 - \phi(t)}\right) = \phi(t)(1 + \lambda) + (1 - \phi(t)) - \phi(t)\lambda = 1.$$
The net impact of this on poverty is then15
14 DAD: Decomposition|FGT: Sectoral.
15 DAD: Poverty|Impact of Demographic Change.
We may instead wish to predict the impact of an absolute increase in the population share of a group t. Let this change be from φ(t) to φ(t) + λ, with a corresponding fall in the other groups' population share that is proportional to their initial share (a fall from φ(k) to φ(k) (1 – λ/(1 – φ(t)))). The resulting change in poverty is analogously given as
Note that the only difference between (5.31) and (5.32) comes from the size of the increase in φ(t), which is φ(t)λ in (5.31) and λ in (5.32).
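Both kinds of demographic change can be simulated directly from the share adjustments described above, without relying on a closed-form expression. The sketch below is illustrative: the function names, the three groups and their poverty levels are hypothetical, and within-group poverty is held fixed.

```python
def total_poverty(shares, poverty):
    """Total poverty as the share-weighted sum of group poverty indices."""
    return sum(shares[k] * poverty[k] for k in shares)


def shift_share(shares, t, increase):
    """Raise group t's share by `increase` (in absolute terms).

    The other groups' shares fall in proportion to their initial share,
    so that all shares still add up to one.
    """
    rest = 1.0 - shares[t]
    new_shares = {k: s * (1.0 - increase / rest) for k, s in shares.items() if k != t}
    new_shares[t] = shares[t] + increase
    return new_shares


def impact_proportional(shares, poverty, t, lam):
    """Share of t rises from phi(t) to phi(t) * (1 + lam)."""
    new_shares = shift_share(shares, t, shares[t] * lam)
    return total_poverty(new_shares, poverty) - total_poverty(shares, poverty)


def impact_absolute(shares, poverty, t, lam):
    """Share of t rises from phi(t) to phi(t) + lam."""
    new_shares = shift_share(shares, t, lam)
    return total_poverty(new_shares, poverty) - total_poverty(shares, poverty)


# Hypothetical groups.
shares = {"north": 0.5, "south": 0.3, "east": 0.2}
poverty = {"north": 0.4, "south": 0.2, "east": 0.1}
t, lam = "north", 0.1

effect_proportional = impact_proportional(shares, poverty, t, lam)
effect_absolute = impact_absolute(shares, poverty, t, lam)

# Average poverty of everyone outside group t, used as a cross-check.
poverty_rest = sum(shares[k] * poverty[k] for k in shares if k != t) / (1.0 - shares[t])

print(effect_proportional, effect_absolute, poverty_rest)
```

In both cases the simulated change equals the increase in group t's share multiplied by the difference between poverty in group t and average poverty in the rest of the population, so the two effects differ only by the factor φ(t).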
Let C income components add up to total income X(p), with $X(p) = \sum_{c=1}^{C} X_{(c)}(p)$ and $X_{(c)}(p)$ being the expected value of income component c at rank p in the distribution of total income. $X_{(c)}(p)$ can be, for instance, agricultural or capital income, or the income of those living in some geographic area, or some type of expenditure that enters total expenditure X.16
We may wish to know by what amount total poverty is reduced by the presence of an income component. Clearly, we would expect those components with a large mean μX(c) to be more effective in helping to alleviate total poverty. But we must also take into account the distribution of $X_{(c)}(p)$. Suppose for instance that urban capital income is larger than rural capital income, but that poverty is low in urban areas because urban labor income is large there. Then, it is unclear whether relatively high capital income in urban areas is more effective at alleviating poverty than the relatively low capital income in rural areas, where poverty is more concentrated.
The contribution of an income component c to poverty alleviation can be given by the fall in poverty after $X_{(c)}$ is added to initial income. But this fall depends on what this initial income is. Does it include some of the other income components? This path-dependency difficulty can again be circumvented by the use of the Shapley value. We start by assuming maximum poverty, that is, poverty when total income is nil for everyone. We then estimate the contribution of component c to poverty alleviation as the expected value of its marginal contribution when it is added to any one of the various subsets of income components that one can choose from the set of all the components.
16 DAD: Decomposition|FGT: Decomposition by Sources.
When a component is missing from that set for an individual, we assume that its value is 0.
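The Shapley procedure for income sources can be sketched by averaging marginal contributions over every ordering in which the components may be added, which is one standard way of computing a Shapley value. This is not DAD's implementation: the function names and the component data are hypothetical, samples are unweighted, poverty is the normalized FGT index, and the brute-force enumeration is only practical for a handful of components.

```python
from itertools import permutations


def fgt(incomes, z, alpha):
    """Normalized FGT index on an unweighted sample."""
    return sum(((z - y) / z) ** alpha for y in incomes if y < z) / len(incomes)


def shapley_source_contributions(components, z, alpha):
    """Contribution of each income component to poverty alleviation.

    `components` maps a component name to a list of individual amounts,
    all lists being in the same individual order. A component that is
    not yet included counts as zero for everyone.
    """
    names = list(components)
    n = len(components[names[0]])
    orderings = list(permutations(names))
    contributions = {name: 0.0 for name in names}
    for ordering in orderings:
        income = [0.0] * n                      # maximum poverty: nil income
        poverty = fgt(income, z, alpha)
        for name in ordering:
            income = [y + c for y, c in zip(income, components[name])]
            new_poverty = fgt(income, z, alpha)
            # Marginal fall in poverty when this component is added.
            contributions[name] += (poverty - new_poverty) / len(orderings)
            poverty = new_poverty
    return contributions


# Hypothetical income components for four individuals.
components = {
    "labor": [300, 600, 900, 1400],
    "capital": [0, 100, 300, 900],
    "transfers": [200, 100, 0, 0],
}
z, alpha = 1000, 1

contributions = shapley_source_contributions(components, z, alpha)
total_income = [sum(values) for values in zip(*components.values())]
poverty_reduction = fgt([0.0] * 4, z, alpha) - fgt(total_income, z, alpha)

print(contributions, poverty_reduction)
```

The contributions add up to the difference between maximum poverty and observed poverty, whatever the order in which the components are listed.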
Rowntree (1901) predated by far the modern quantitative approach to poverty measurement. General and recent references include Chen and Ravallion (2001) (for wide empirical evidence on poverty), Constance and Michael (1995) (for the US debate on poverty measurement), Deaton (2001) (for the empirical difficulties associated with "counting the poor"), Glewwe (2001) (for a very extensive coverage of the nature, evolution, and causes of poverty), Jantti and Danzinger (2000) (for poverty in more advanced countries), Lipton and Ravallion (1995) (for poverty and policy), Ravallion (1994) and Ravallion (1996) (for a nontechnical overview and discussion of poverty measurement issues), Smeeding, Rainwater, and O'Higgins (1990) (for early results using Luxembourg Income Study data) and Zheng (1997) (for a review of poverty indices).
The papers by Watts (1968), Sen (1976) and Foster, Greer, and Thorbecke (1984) influenced greatly much of the subsequently large literature on poverty indices. Relatively early contributions on poverty measurement are found in Anand (1977), Blackorby and Donaldson (1980), Chakravarty (1983a), Chakravarty (1983b), Clark, Hamming, and Ulph (1981), Donaldson and Weymark (1986), Foster (1984), Hagenaars (1987), Kakwani (1980), Kundu and Smith (1983), Takayama (1979), and Thon (1979). More recent works include Chakravarty (1997), Myles and Picot (2000), Osberg and Xu (2000) and Shorrocks (1995) on a revisited and improved form of the Sen (1976) poverty index; Duclos and Gregoire (2002) on the link between linear poverty indices and relative deprivation; Morduch (1998) and Zheng (1993) on the Watts index; Pattanaik and Sengupta (1995) on the original Sen index; and Shorrocks (1998) on "deprivation profiles".
Applied poverty studies using these developments have been almost innumerable. A small subset of the studies that have been published includes Coulombe and McKay (1998) (Mauritania), Coulombe and McKay (1998) (Ghana), Davidson and Duclos (2000) (using LIS data), Gustafsson and Nivorozhkina (1996) (Northern countries), Grootart and Kanbur (1995) (Côte d'Ivoire), Gustafsson and Shi (2002) (China), Hagenaars and De Vos (1988) (the Netherlands), Hill and Michael (2001) (US), Iceland, Short, Garner, and Johnson (2001) (US), Milanovic (1992) (Poland), Osberg and Xu (1999) (Canada), Osberg (2000) (Canada and the US), Pendakur (2001) (Canada), Rady (2000) (Egypt), Ravallion and Bidani (1994) (Indonesia), Ravallion and Chen (1997) (67 less developed countries), Rodgers and Rodgers (2000) (Australia), and Szulc (1995) (Poland).
The empirical links between growth, poverty and inequality have also often been analyzed in recent years. Studies on whether growth is beneficial to the poor, both absolutely and relatively speaking, include Bigsten and Shimeles (2003) (for Ethiopian evidence), Datt and Ravallion (2002) (for a survey of the Indian evidence), Dollar and Kraay (2002) (for an influential study of the experience of 42 countries over 4 decades), Essama-Nssah (1997) (for Madagascar evidence), Ravallion and Chen (1997) (where growth is found to decrease inequality as often as it increases it), Ravallion (2001) (where a warning against the use of cross-country regressions is made), and Ravallion and Datt (2002) (for differential evidence across Indian states). De Janvry and Sadoulet (2000), Deininger and Squire (1998) and Ravallion (1998a) also apply causal tests to determine whether inequality favors or impedes growth. See also Ravallion and Chen (2003) and Tsui (1996) for the use of the average poverty gap and the Watts index as indices of whether growth is beneficial to the poor.
Three major issues arise in the estimation and in the use of poverty lines. First, we must define the space in which wellbeing is to be measured. As discussed in Chapter 1, this can be the space of utility, incomes, "basic needs", functionings, or capabilities. Second, we must determine whether we are interested in an absolute or in a relative poverty line in the space considered. Third, we must choose whether it is by someone's "capacity to function" or by someone's "actual functioning" that we will judge if that person is poor. We consider first the issue of the choice between an absolute and a relative poverty line.
An absolute poverty line can be interpreted as fixed in any one of the spaces in which we wish to assess wellbeing. Conversely, a relative poverty line would depend on the distribution of wellbeing (including the utilities, living standards, functionings or capabilities) found in a society and would therefore vary across societies. Considerable controversy exists on whether absoluteness or relativity is a better property for a poverty threshold. Most analysts would probably agree that a poverty threshold defined in the space of functionings and capabilities should be absolute (but even on this there is no unanimity). An absolute threshold in these spaces would, however, generally imply relativity of the corresponding thresholds in the space of the commodities and in the level of basic needs required to achieve these functionings.
There are two main reasons for this. First, the relative prices and the availability of commodities depend on the distribution of incomes. For instance, as a society initially develops, rising numbers of people need to travel to work and to trade, without first being able to afford the costs of private transportation. Because of increasing returns to scale in the provision of public transportation, the affordability and accessibility of public transportation usually also first increase during that development stage. As societies become richer on average, however, their citizens start making increasing use of private forms of transportation, a phenomenon which causes a fall in the supply and availability of public transportation, leading to an increase in its relative price. This makes the capacity to travel (arguably an important capacity) more or less costly, depending on the state of economic development.
Second, not to be deprived of some capability may require the absence of relative deprivation in the space of some commodities. In support of this, there is Adam Smith's famous statement that the commodities needed to go without shame (an oft-mentioned basic functioning) can be to some extent relative to the distribution of such commodities in a society:
By necessaries I understand not only the commodities which are indispensably necessary for the support of life, but whatever the custom of the country renders it indecent for creditable people, even of the lowest order, to be without. A linen shirt, for example, is, strictly speaking, not a necessary of life. The Greeks and Romans lived, I suppose, very comfortably though they had no linen. But in the present times, through the greater part of Europe, a creditable day-laborer would be ashamed to appear in public without a linen shirt, the want of which would be supposed to denote that disgraceful degree of poverty which, it is presumed, nobody can well fall into without extreme bad conduct. Custom, in the same manner, has rendered leather shoes a necessary of life in England. The poorest creditable person of either sex would be ashamed to appear in public without them. In Scotland, custom has rendered them a necessary of life to the lowest order of men; but not to the same order of women, who may, without any discredit, walk about barefooted. In France they are necessaries neither to men nor to women, the lowest rank of both sexes appearing there publicly, without any discredit, sometimes in wooden shoes, and sometimes barefooted. Under necessaries, therefore, I comprehend not only those things which nature, but those things which the established rules of decency have rendered necessary to the lowest rank of people. (Smith 1776, Book 5, Chapter 2)
Sen (1985) reinforces this by distinguishing clearly the two dimensions of capabilities and commodities:
I would like to say that poverty is an absolute notion in the space of capabilities but very often it will take a relative form in the space of commodities and characteristics (Sen 1985, p.335).
This view is in fact also consistent with the World Bank's influential definition of poverty, which says that poverty is the inability to attain a minimal standard of living (World Bank 1990). This minimal standard consists of
of nutrition and other basic necessities and a further amount that varies from country to country, reflecting the cost of participating in the everyday life of society. (World Bank 1990, p. 26)
This has led some writers (particularly in developed countries) to conclude that attempts to preserve some degree of absoluteness in the space of commodities are untenable:
In summary, it does not seem possible to develop an approach to poverty measurement which is linked to absolute standards. While some analysts are uneasy with relativist concepts of poverty on the grounds that they are difficult to comprehend and can be seen as somewhat arbitrary and open to manipulation, no real practical alternative to relativist concepts exists. (Saunders 1994, p. 227)
Complete relativity of the poverty line in the space of commodities would nevertheless draw poverty analysis very close to the analysis of social exclusion (as exemplified by Rodgers, Gore, and Figueiredo 1995 at the International Labor Organization) and relative deprivation (as propounded for instance by Townsend 1979). Social exclusion entails "the drawing of inappropriate group distinctions between free and equal individuals which deny access to or participation in exchange or interaction" (Silver 1994, p.557). This includes participation in property, earnings, public goods, and in the prevailing consumption level (Silver 1994, p.541). Relative deprivation focuses on the inability to enjoy living standards and activities that are ordinarily observed in a society. Townsend (1979) defines it as a situation in which
Individuals, families and groups in the population (...) lack the resources to obtain the types of diet, participate in the activities and have the living conditions and amenities which are customary or at least widely encouraged or approved, in the society to which they belong (p.30).
Equating absolute deprivation in the space of capabilities with relative deprivation in the space of commodities can, however, be a source of confusion in poverty comparisons. First, it tends to blur the operational and conceptual distinction between poverty and inequality. Second, it can hinder the identification of "core" or absolute poverty in any of the spaces. The identification of core poverty is, indeed, probably the most important input into the design of public policy in developing countries. Third, although the ethical appeal of Sen's capability approach has variously been invoked to justify the use of an entirely relative poverty line in the space of commodities, Sen himself does not accept this:
Indeed, there is an irreducible core of absolute deprivation in our idea of poverty, which translates reports of starvation, malnutrition and visible hardship into a diagnosis of poverty without having to ascertain first the relative picture. Thus the approach of relative deprivation supplements rather than supplants the analysis of poverty in terms of absolute dispossession (Sen 1981, p. 17).
Furthermore,
(...) considerations of relative deprivation are relevant in specifying the 'basic' needs, but attempts to make relative deprivation the sole basis of such specification is doomed to failure since there is an irreducible core of absolute deprivation in the concept of poverty (Sen 1981, p.17).
Given the measurement difficulties involved in estimating relative poverty lines that correspond to absolute poverty lines in the space of functionings and capabilities, analysts often find it most transparent to use the space of living standards as the space in which to define an absolute threshold. If this is done, however, it must subsequently be admitted that the procedure will imply a set of thresholds in the space of functionings and capabilities that depend at least partly on the conditions of the society in which an individual lives. Indeed, for a given absolute level of living standard in the space of commodities, an individual's capabilities are generally relative, that is, they depend on his social and economic environment, at least for functionings such as shamelessness and participation in the life of the community.
Methodologies for the estimation of poverty lines have been most developed in the context of the fulfillment of basic physiological needs. Although such methodologies have often been set in a welfarist framework, they also matter for the basic needs, functioning or capability approaches since these approaches are also concerned with basic physiological achievements. These methodologies have recently been most often applied to developing country contexts.
The estimation of the "cost of basic needs" (CBN) usually involves two steps. First, an estimation is made of the minimal food expenditures that are necessary for living in good health; we will denote this by zF. Second, an analogous estimate of the required nonfood expenditures, zNF, is computed and added to zF to yield a total poverty line, zT. We now consider each of these two steps in some detail.
The first step in the computation of a global poverty line is usually to estimate a food poverty line. The determination of a food poverty line generally proceeds by asking what amount of food expenditures is required to achieve some minimal required level of food-energy intake (or nutrient intake, such as proteins, vitamins, fat, or minerals). Early examples of the application of this approach include Rowntree (1901) and Orshansky (1965). A basket of food commodities is designed or estimated by "food specialists" so as to provide those minimally required levels of food-energy intake. The cost of that basket yields the food poverty line zF.
To illustrate how this exercise can be carried out in practice, consider Figure 6.1, which plots consumption x1(p) and x2(p) of two goods, goods 1 and 2, over a range of percentiles p. For simplicity, Figure 6.1 supposes that good 1 is "income-inelastic" (x1(p) is constant) but that the consumption of good 2 increases with the rank in the distribution of income (it is income elastic). The idea then is to select a combination of x1(p) and x2(p) that provides a given level of minimal calorie intake. For the purposes of this illustration, assume that this minimum energy intake is 3000 calories per day, and that one unit of good 1 provides 2000 calories and one unit of good 2 provides 1000 calories. Also assume that each unit of good 1 and of good 2 costs q$.
The cheapest way to achieve the minimum calorie intake would be to consume only good 1, since good 1 is the most calorie-efficient (we can think of good 1 as "cereals" and good 2 as "meat"). Indeed, each calorie provided by the consumption of good 1 costs q$/2000, whereas each calorie provided by the consumption of good 2 costs twice as much, that is, q$/1000. 1.5 units of good 1 (1.5 units × 2000 calories/unit = 3000 calories) would then be required for the minimal energy intake to be met, and zF would then equal 1.5q$.
This, however, would suppose a food commodity basket that no individual in Figure 6.1 would be observed to consume. Even at the very bottom of the distribution of income, individuals indeed consume at least some of good 2 at the expense of a diminished consumption of the more calorie-efficient good 1. We should presumably take this information into account if we wished to respect at least to some extent the cultural and culinary preferences of those whose wellbeing we aim to evaluate. This raises the obvious question of which preferences we should consider. Note that the preferred ratio of good 2 over good 1 increases continuously with p in Figure 6.1. For convenience, denote that ratio by ρ(p) = x2(p)/x1(p). Simple algebra then shows that the cost of attaining the minimum calorie intake is given by zF(p) = 3q$(1 + ρ(p))/(2 + ρ(p)), where zF(p) indicates that zF depends on the rank p of those whose preferences we use to build the commodity basket and to compute the food poverty line.
Figure 6.1 plots zF(p) and shows that it is not neutral to the choice of p. Using the preferences of the poorest, we obtain zF(p = 0) = 1.8q$, but if we use the preferences of the median population, we get zF(p = 0.5) = 2.1q$. This is in fact just one example of a more general standard observation in the literature on poverty lines that the choice of reference parameters matters for the estimation of poverty lines. In Figure 6.1, the farther the preferences ρ(p) are from the most calorie-efficient choice, the more costly is the estimated food poverty line zF(p). Arguably, the preferences ρ(p) should be those of the individuals that are close to the total poverty line, but this is a (partly) circular argument since ρ(p) is itself a determinant of that total poverty line. In practice, an arbitrary value of p is often chosen, reflecting some a priori belief on the position of those at the edge of the total poverty line. A more common (though arguably less commendable) procedure is to compute and use an average value of x2(p)/x1(p) over a range of p, such as the bottom 25% or 50% of individuals in a population.
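The dependence of the food poverty line on the reference preferences can be traced with the numbers of this illustration. In the sketch below the function name is hypothetical, the price q is set to 1 so that results read as multiples of q$, and the ratios 0.5 and 4/3 are not stated in the text: they are the values of ρ implied by zF(p = 0) = 1.8q$ and zF(p = 0.5) = 2.1q$.

```python
def food_poverty_line(rho, price=1.0, min_calories=3000.0,
                      calories_good1=2000.0, calories_good2=1000.0):
    """Cost of the basket with x2/x1 = rho that just meets the calorie minimum.

    Both goods cost `price` per unit, as in the illustration.
    """
    # Calories: calories_good1 * x1 + calories_good2 * rho * x1 = min_calories.
    x1 = min_calories / (calories_good1 + calories_good2 * rho)
    x2 = rho * x1
    return price * (x1 + x2)


line_cereals_only = food_poverty_line(0.0)        # most calorie-efficient basket
line_poorest = food_poverty_line(0.5)             # rho implied by 1.8q$
line_median = food_poverty_line(4.0 / 3.0)        # rho implied by 2.1q$

print(line_cereals_only, line_poorest, line_median)
```

The function reproduces the closed form 3q$(1 + ρ)/(2 + ρ), and the line rises steadily as the reference basket moves away from the calorie-efficient good.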
Even if we were to agree on the position p at which we wish to observe preferences such as ρ(p), there still remains the awkward fact that preferences will often vary significantly even at this given value of p. Said differently, there are in practice many different actual consumption patterns for a group of "typical poor". One solution is simply to ignore these differences and estimate the typical poor's average consumption patterns. Following this line of argument, consumption expenditures on various food items are regressed against income and the estimated parameters of these regressions are then used to predict the consumption patterns of the "typical poor". These regressions have often been parametric, assuming for instance that expenditures on cereals and meat are globally quadratic or log-linear in total expenditures. It is unlikely, however, that such parametric forms fit appropriately at all income levels, low and high alike. A better statistical procedure would probably be to regress consumption expenditures nonparametrically on total expenditures, which would allow for a better fit of the preferences of those around the "typical poor".
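One way to carry out such a nonparametric regression is a kernel-weighted local average. The text does not name a particular estimator; the sketch below uses the Nadaraya-Watson estimator with a Gaussian kernel as one possible choice, and the function name, the data, the evaluation point and the bandwidth are all hypothetical.

```python
import math


def kernel_regression(x_obs, y_obs, x0, bandwidth):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel.

    Observations whose x is close to x0 receive the largest weights, so
    the fit at x0 is driven by households near that expenditure level.
    """
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in x_obs]
    return sum(w * y for w, y in zip(weights, y_obs)) / sum(weights)


# Hypothetical household data: total expenditure and cereal expenditure.
total_expenditure = [400, 600, 800, 1000, 1200, 1600, 2000]
cereal_expenditure = [200, 280, 340, 380, 400, 420, 430]

# Predicted cereal spending of a household at an assumed reference level of 1000.
reference_level = 1000
predicted_cereals = kernel_regression(
    total_expenditure, cereal_expenditure, reference_level, bandwidth=200
)

print(predicted_cereals)
```

The bandwidth controls how local the fit is: a small value uses only households very close to the reference level, while a very large value approaches the overall sample mean.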
An additional important issue is whether variations in culinary tastes and food habits across socioeconomic characteristics should be taken into account. If no account is taken of such variations, then we can choose as a reference group the group whose diet minimizes food cost while providing the minimum required level of food-energy intake. This would typically generate an unreasonably low level of expenditures for many other groups, with an implied dietary basket of food commodities that could again be very different from the one they typically consume.
If, however, full account of diversity in culinary tastes were to be taken, a serious risk would exist of overestimating the poverty lines of those individuals and groups of individuals with a greater taste for expensive foods (e.g., of higher quality or better taste). This is commonly the case, for instance, for urban households, who customarily have more sophisticated culinary tastes than rural dwellers (for the same overall living standards), and have also greater access to a larger variety of imported and expensive foods. This procedure would then assign greater poverty lines to urban versus rural individuals. It would also mean that the utility equivalents of individual food poverty lines would depend on the peculiarities of the individuals' food preferences. This would generally lead to inconsistent comparisons of wellbeing across urban and rural inhabitants, and would exaggerate the degree of poverty in the urban as compared to the rural areas.
We can illustrate this using Figure 6.2. Figure 6.2 shows baskets of two food commodities, x1 and x2, with three food budget constraints of total food consumption equal to Y0, Y1, and Y2 (these total budgets are expressed in units of x1). Figure 6.2 also shows a "minimum calorie constraint", along which the total calories provided by the consumption of x1 and x2 equal the required minimum level of calorie intake. If no account whatsoever were taken of preferences, Y0 would yield the food poverty line. But along the food budget constraint Y0, there is only one point which meets the minimum calorie constraint (the point at which x1 = Y0 and x2 = 0), and it is of course unlikely that individuals will choose a food basket to be precisely at that corner. An individual with preferences U0 and budget Y0, for instance, would not locate himself on the minimum calorie constraint. It is only with the more generous budget constraint Y1 that this individual will consume the minimally required level of calorie intake, as shown on Figure 6.2.
But not all individuals will necessarily choose to be "calorie-sufficient" even with a total food budget of Y1. Individuals with greater preferences for the less calorie-efficient good x2 (as in the case of U2) will not choose a food basket on or above the minimum calorie constraint. Individuals with preferences U2 will instead need Y2 to be calorie-sufficient. Yet, whether individuals with preferences U1 and budget Y1 are just as well off as individuals with preferences U2 and budget Y2 is debatable. Such would be the assumption, however, if we used two distinct poverty lines Y1 and Y2 for the two different tastes.
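The role of tastes in Figure 6.2 can be sketched numerically. The snippet below is only an illustration under assumptions that are not in the text: preferences are taken to be Cobb-Douglas (a fixed share `share_x2` of the food budget goes to x2), prices are set to one so that budgets are in units of x1, and the calorie contents per unit are made-up numbers.

```python
def min_food_budget(share_x2, kcal_x1, kcal_x2, kcal_required,
                    price_x1=1.0, price_x2=1.0):
    """Smallest food budget at which the chosen basket just meets the calorie minimum.

    Hypothetical assumption: Cobb-Douglas tastes, so a fixed share of the budget
    is spent on x2 and the rest on x1, whatever the size of the budget.
    """
    kcal_per_unit_of_budget = ((1.0 - share_x2) * kcal_x1 / price_x1
                               + share_x2 * kcal_x2 / price_x2)
    return kcal_required / kcal_per_unit_of_budget


# Made-up numbers: x2 delivers half the calories of x1 per unit spent.
KCAL_X1, KCAL_X2, KCAL_MIN = 1000.0, 500.0, 2400.0

y0 = min_food_budget(0.00, KCAL_X1, KCAL_X2, KCAL_MIN)  # corner basket, only x1
y1 = min_food_budget(0.25, KCAL_X1, KCAL_X2, KCAL_MIN)  # moderate taste for x2
y2 = min_food_budget(0.50, KCAL_X1, KCAL_X2, KCAL_MIN)  # strong taste for x2

print(y0, y1, y2)
```

Under these assumptions the required budget rises with the taste for the less calorie-efficient good, which is the ordering Y0 < Y1 < Y2 drawn in the figure.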
As mentioned above, such comparability assumptions are often implicitly made in practice when individuals living in different regions, rural or urban for instance, are assigned different poverty lines for reasons independent of differences in needs or prices. As illustrated in Figure 6.2, this supposes that an individual with "sophisticated" preferences (an urban dweller who has been accustomed to food variety) needs a higher budget to be as "well off" as an individual with less expensive preferences (a rural dweller who is content with eating basic food types). Probably more convincing, however, would be the view that U2 with Y2 in Figure 6.2 provides greater utility and wellbeing than U1 with Y1. Assigning different poverty lines Y1 and Y2 would then lead to inconsistent and biased poverty estimates.
Minimally required food expenditures can also be (and are often) adjusted for differences in climate, sex, or age, when such differences impact on needs rather than on tastes (as we discussed above). These expenditures can also be adjusted for variations in activity levels, although activity levels depend on the level of one's wellbeing, and thus on one's poverty status. Activity-level adjustments would thus generate a poverty line that evolves endogenously with the standard of living of individuals, a slightly awkward feature for comparing poverty.
The subsequent step is usually to estimate the nonfood component of the total poverty line. The most popular method for doing this is simply to go straight to an estimate of the total poverty line by dividing the food poverty line by the share of food in total expenditures. The intuition behind this is as follows. The larger the food share in total expenditures, the closer the food poverty line should be to the total poverty line. Therefore, the smaller should be the necessary adjustment to the food poverty line (the closer to 1 should be the denominator that divides the food poverty line). Indeed, dividing zF by zF/zT (the food share) gives zT. The problem of which food share to use is of course an important issue. It is a problem analogous to the one discussed above on what the food basket should be for computing a food poverty line. Popular practices vary, but often make use of:
A. the average food share of those whose total expenditures equal the food poverty line (E: 18.4.5);
B. the average food share of those whose food expenditures equal the food poverty line (E: 18.4.3);
C. the average food share of a bottom proportion of the population (e.g., the 25% or 50% poorest).
In addition to this, another popular method
D. adds to zF the nonfood expenditures of those whose total expenditures equal zF (E: 18.4.7).
To see how methods A, B and D work and differ from each other, consider Figure 6.3. Figure 6.3 shows (predicted) total expenditures against various levels of food expenditures. The regression can be done parametrically, but a generally better approach would be to predict total expenditures using a nonparametric regression on food expenditures.¹ On each of the two axes is shown the level of the (previously estimated) food poverty line zF. These two levels meet at the 45-degree line.
As indicated above, method A makes use of the average food share of those whose total expenditures equal the food poverty line. Total expenditures equal the food poverty line, zF, at point E on Figure 6.3. The food share at point E is given by the inverse of the slope of the line OE that goes from the origin to point E. The total poverty line according to method A is therefore given by the height of a line OE that extends to just above a level of food expenditures zF. This gives the vertical height of point A as the total poverty line according to method A.
Method B makes use of the average food share of those whose food expenditures equal the food poverty line. Those who consume zF in food are located at point B on Figure 6.3. Their food share is given by the inverse of the slope of the straight line that would extend from point O to point B. Hence, dividing zF by that food share brings us back to point B, which is therefore the total poverty line according to method B.

¹ DAD: Distribution|Non-Parametric Regression.
The total poverty line according to method B is more generous than that according to method A since the food share used for B is lower than that used for A. Indeed, method A focusses on the food share of a rather deprived population: those who, in total, only spend the food poverty line. Method B focusses on the food share of a less deprived population: those who, on food only, spend the food poverty line. Since food shares tend to decline with standards of living, method B's food share is usually lower than method A's.
Finally, method D considers the nonfood expenditures of those whose total expenditures equal zF. As for method A, these individuals are found at point E on Figure 6.3. Their nonfood expenditures are given by the length of line EG on the Figure. Adding these nonfood expenditures to zF yields a total poverty line given by the height of point D.
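Methods A, B and D can be sketched in a few lines. This is a minimal illustration, not the DAD procedure: the Gaussian-kernel (Nadaraya-Watson) smoother, the bandwidth, the function names and the linear Engel curve used as data are all assumptions made here. Point E is located by smoothing food spending against total spending, which is a simplification of reading it off the curve in Figure 6.3.

```python
import numpy as np


def kernel_mean(x, y, x0, bandwidth):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel."""
    weights = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
    return np.sum(weights * y) / np.sum(weights)


def total_poverty_lines(food, total, z_food, bandwidth):
    """Total poverty lines implied by methods A, B and D for a given food line."""
    # Point E: expected food spending of those whose TOTAL spending equals z_food.
    food_at_e = kernel_mean(total, food, z_food, bandwidth)
    # Point B: expected total spending of those whose FOOD spending equals z_food.
    total_at_b = kernel_mean(food, total, z_food, bandwidth)
    share_a = food_at_e / z_food      # food share used by method A
    share_b = z_food / total_at_b     # food share used by method B
    return {
        "A": z_food / share_a,
        "B": z_food / share_b,
        "D": z_food + (z_food - food_at_e),  # add the nonfood spending at point E
    }


# Hypothetical, noise-free data: food = 20 + 0.5 * total expenditures.
total = np.linspace(20.0, 1000.0, 981)
food = 20.0 + 0.5 * total

lines = total_poverty_lines(food, total, z_food=100.0, bandwidth=10.0)
print(lines)
```

With this made-up Engel curve the ordering is D < A < B, in line with the remark that method B is more generous than method A because its food share is lower.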
The choice of methods and food shares and the estimation of the nonfood poverty lines are rather arbitrary, and the resulting estimate of the total poverty line will also be somewhat arbitrary. Moreover, and perhaps more worryingly, some of the estimates will also vary with the distribution of living standards, as in the case of method C where the food share is an average over a range of individuals. To avoid inconsistencies in poverty comparisons, it would therefore seem preferable to use the same food share across the distributions being compared, and to use methods that do not make estimates overly dependent on a particular distribution of living standards.
A slightly different method for estimating poverty lines that is popular in the literature is the so-called Food-Energy-Intake (FEI) method. Estimates of observed calorie intakes are first computed and then graphed against observed (total or food) expenditures. The analyst then estimates the expenditures of those whose calorie intake is just at the minimum required for healthy subsistence. When these expenditures are on food, this provides a food poverty line, which can then be used as described above in Section 6.3.3 to provide an estimate of a global poverty line. When the expenditures are total expenditures, the FEI method provides a direct link between a minimum calorie intake and a total poverty line.²
E:18.4.1
Figure 6.4 illustrates how this method works. The curve shows the level of expenditure (measured on the vertical axis) that is observed (on average) at a given level of calorie intake (shown on the horizontal axis). The curve is increasing and convex, since calorie intake is usually expected to increase at a diminishing rate with food or total expenditures. Above zk, the minimum calorie intake recommended for a healthy life, we read z, the food or total poverty line according to the FEI method.

² DAD: Distribution|Non-Parametric Regression.
As just described, the FEI method may appear straightforward and simple to implement. A number of conceptual and measurement problems are hidden, however, behind this apparent simplicity. Note for instance that the line traced on Figure 6.4 is the expected link between expenditure and calorie intake; there is in real life a significant amount of variability around this line. How are we to interpret this variability? If it is due to measurement errors, then we may perhaps ignore it. If it is due to variability in preferences, then we may wish to model the relationship between calorie intake and expenditure separately for different groups of the population, as is often done in practice, for urban and rural areas for instance. As in the cost-of-basic-needs method, however, we then run the risk of estimating higher poverty lines for those groups that have more expensive or more sophisticated tastes for food. This would lead to inconsistent comparisons of wellbeing and poverty, as discussed in Section 6.3.2.
To compute expected expenditure (given the variability of actual observed spending) at a given calorie intake, we can estimate the parameters of a parametric regression linking expenditures to calorie intake. Again, the regression is often postulated to be log-linear or quadratic. This parametric specification supposes, however, that the functional relationship between expenditures and calorie intake is known by the analyst, up to some unknown parameter values. This is unlikely to be true everywhere, especially for those far from the level of calorie intake of interest (e.g., those at the lower and upper tails of the distribution of spending and calorie intake). In such cases, the parametric procedure will leave the estimated expenditure poverty line sensitive to the presence of "outliers" that are relatively far from the minimum level of calorie intake. This procedure will then generate a biased estimator of the "true" poverty line. A more flexible and arguably better approach would be to estimate the link between expenditures and calorie intake nonparametrically.
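A nonparametric FEI estimate can be sketched as a local linear regression of expenditure on calorie intake, evaluated at the calorie requirement. The code below is a hedged illustration rather than DAD's estimator: the Gaussian kernel, the bandwidth, the function name and the noise-free convex data are assumptions chosen for the example.

```python
import numpy as np


def local_linear(x, y, x0, bandwidth):
    """Local linear (kernel-weighted least squares) estimate of E[y | x = x0]."""
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
    design = np.column_stack([np.ones_like(x), x - x0])
    # Weighted normal equations (X'WX) beta = X'Wy; the intercept is the fit at x0.
    xtw = design.T * w
    beta = np.linalg.solve(xtw @ design, xtw @ y)
    return float(beta[0])


# Hypothetical data: expenditure is increasing and convex in calorie intake.
calories = np.linspace(1000.0, 4000.0, 301)
expenditure = 0.00004 * calories ** 2

# Expected expenditure at a requirement of 2400 calories per day.
z_fei = local_linear(calories, expenditure, x0=2400.0, bandwidth=100.0)
print(z_fei)
```

Only observations near 2400 calories receive noticeable weight, so observations in the tails of the calorie distribution have little influence on the estimated line. The price is a small smoothing bias when the curve is convex, which shrinks with the bandwidth.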
To see whether differences in some of the methodologies described above can matter, consider the case of 1996 Cameroon. Table 6.1 shows the result of estimating food, nonfood and total poverty lines for the whole of Cameroon and for each of its 6 regions separately. Note that the figures are in Francs CFA adjusted for price differences, with Yaoundé being the reference region. The food poverty line was estimated using the FEI method at 2400 calories per day per adult equivalent. A nonparametric regression using DAD was performed for the whole of Cameroon and separately for each of the 6 regions. The lower nonfood poverty line was obtained (nonparametrically) using method D in Section 6.3.3, and the upper nonfood poverty line using method B. Again, the relevant regressions were carried out for the whole of Cameroon and separately for each of its 6 regions.
As can be seen, the link between calorie intake and food expenditures varies systematically across regions. Expected food expenditure at 2400 calories per day is significantly higher in urban areas (Yaoundé, Douala and Other cities) than in the rural ones. In Douala, for instance, a household would need 408 Francs CFA per day per adult equivalent to reach an intake of 2400 calories per day. In the Highlands, no more than 170 Francs CFA would on average be needed. The link between food and total expenditures also varies across Cameroon's regions. Combined with the different estimates for the food poverty lines, this leads to very significant variations across regions in the total poverty lines. Using method D, a lower total poverty line of 589 Francs CFA is obtained for Douala, but that same poverty line is only 235 Francs CFA for the Highlands. Note also that the choice of method B vs method D has a very significant impact on the estimate of the total poverty line. For the whole of Cameroon, the lower and the upper total poverty lines are respectively 373 and 534 Francs CFA, a difference of 43%.
Unsurprisingly, these large differences across regions and across methods have a large impact on national poverty estimates and on regional poverty comparisons. This is illustrated in Table 6.2, which shows the proportion of individuals underneath various poverty lines for various indicators of wellbeing. "Calorie poverty" (first line) is relatively constant across Cameroon. In the whole of Cameroon, 68.1% of the population was observed to consume less than 2400 calories per day per adult equivalent. This proportion varies between 59.9% (for Other cities) and 86.5% (for Forests) across regions. Roughly the same limited variability and the same poverty rankings appear when food poverty is estimated using for each region its own food poverty line (third line). However, when a common food poverty line is used to assess food poverty in each region (second line), national poverty stays roughly unchanged at around 69% but urban regions now appear significantly less poor than the rural ones. For instance, the poverty headcount in Douala (42.0%) is now only half that of the Highlands (82.5%).
The rest of Table 6.2 confirms these lessons. When a common poverty line is used to compare the regions, rural areas are significantly poorer than urban ones. When region-specific poverty lines are used, these differences are much reduced, and the regional rankings are often even reversed. For example, using a common lower total poverty line (fourth line), the Highlands have a headcount ratio more than three times that of the urban regions. When regional lower total poverty lines are used instead, the Highlands become prominently the least poor of all regions. Setting common as opposed to regional poverty lines can thus have a crucial impact on poverty rankings and the setting of subsequent poverty alleviation policies. The choice of a lower as against an upper total poverty line also makes a difference. For the whole of Cameroon, the proportion of the Cameroonian population in poverty increases from 43.9% to 68.0% when we move from a common lower total poverty line (fourth line) to a common upper total poverty line (sixth line). Clearly, this changes significantly one's understanding of the incidence of poverty in Cameroon.
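Because the headcount is a population share, the national headcount under a common poverty line is the population-weighted average of the regional headcounts. The short check below uses the regional figures and population shares reported in Table 6.2. The variable names are illustrative, and the shares are rescaled because the published ones sum to 99.9 after rounding.

```python
import numpy as np

regions = ["Yaoundé", "Douala", "Other cities", "Forests", "Highlands", "Savana"]

# Population shares (%) from the last line of Table 6.2, rescaled to sum to one.
pop_share = np.array([7.1, 9.6, 12.7, 18.5, 27.8, 24.2])
pop_share = pop_share / pop_share.sum()

# Regional headcounts (%) under the common lower and upper total poverty lines.
headcount_lower = np.array([19.2, 16.5, 16.0, 57.7, 58.7, 49.0])
headcount_upper = np.array([41.6, 33.4, 36.5, 83.8, 81.1, 78.7])

national_lower = float(np.sum(pop_share * headcount_lower))
national_upper = float(np.sum(pop_share * headcount_upper))

print(round(national_lower, 1), round(national_upper, 1))
```

Up to rounding, the two weighted averages reproduce the national figures of 43.9% and 68.0% quoted above. The same aggregation need not hold for the lines of the table that use regional poverty lines, since the national figure there may be based on a different line.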
These results also implicitly warn that the choice of wellbeing indicators is not neutral to the identification of the poor. In our context, this is because the correlation between calorie intake, food expenditure and total expenditure is imperfect. Table 6.3 indicates, for example, that in bidimensional poverty analyses using any two of these three indicators of wellbeing, around 20% to 25% of the population is characterized as poor in one dimension but non poor in the other. In the first part of Table 6.3, we note for instance that 11.2% of the population would be judged poor in terms of calorie intake but not poor in terms of food expenditure. Conversely, 9.6% of the population would be deemed non poor in terms of calorie intake but poor in terms of food expenditure. These proportions are slightly higher for the other bidimensional poverty analyses, which compare food with total expenditure poverty, and calorie with total expenditure poverty, respectively.
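The "20% to 25%" range can be checked by adding the two off-diagonal cells of each part of Table 6.3, that is, the shares of the population classified as poor on one indicator but not on the other. For calorie versus food expenditure, food versus total expenditure, and calorie versus total expenditure, respectively:

$$11.2 + 9.6 = 20.8, \qquad 9.8 + 11.3 = 21.1, \qquad 12.3 + 12.2 = 24.5 \quad (\text{in \% of the population}).$$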
There are two other popular methodologies for the estimation of poverty lines. The first deals with purely relative poverty lines, which, as we saw above, can be useful to determine the commodities needed for "living without shame" and for participating in the "prevailing consumption level". A relative poverty line is typically set as a somewhat arbitrary proportion of the mean or of some income quantile (often the median). Clearly, such a poverty line will vary with the central tendency of the income distribution, and will not be the same in constant terms across space and time. One possibly awkward feature of the use of a relative poverty line approach is that a policy which raises the income of all, but proportionately more those of the rich, will increase poverty, although the absolute incomes of the poor have risen. Conversely, a natural catastrophe which hurts absolutely everyone will decrease poverty if the rich are proportionately the most hurt.³
E:18.3
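Both effects are easy to reproduce on a toy distribution. The sketch below assumes a relative poverty line set at half of mean income and uses five made-up incomes; the function name and all numbers are illustrative only.

```python
import numpy as np


def relative_headcount(incomes, lam=0.5):
    """Share of incomes strictly below the relative poverty line lam * mean."""
    incomes = np.asarray(incomes, dtype=float)
    return float(np.mean(incomes < lam * incomes.mean()))


base = np.array([4.0, 6.0, 10.0, 20.0, 60.0])

# Growth: everyone gains 10%, except the richest whose income doubles.
growth = base * np.array([1.1, 1.1, 1.1, 1.1, 2.0])

# Catastrophe: everyone loses 10%, except the richest who loses 75%.
shock = base * np.array([0.9, 0.9, 0.9, 0.9, 0.25])

h_base = relative_headcount(base)
h_growth = relative_headcount(growth)
h_shock = relative_headcount(shock)
print(h_base, h_growth, h_shock)
```

In this example the relative headcount rises after the growth episode, although every income has increased, and falls after the catastrophe, although every income has decreased.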
³ DAD: Poverty|FGT Index.

Table 6.1: Estimated poverty lines in Cameroon, 1996 (Francs CFA per day per adult equivalent)

| Region | FEI food poverty line | Lower nonfood poverty line | Lower total CBN poverty line | Upper nonfood poverty line | Upper total CBN poverty line |
|---|---|---|---|---|---|
| Cameroon | 256 | 117 | 373 | 278 | 534 |
| Yaoundé | 337 | 143 | 480 | 412 | 749 |
| Douala | 408 | 181 | 589 | 588 | 995 |
| Other cities | 347 | 152 | 499 | 385 | 732 |
| Forests | 259 | 134 | 393 | 214 | 473 |
| Highlands | 170 | 65 | 235 | 186 | 357 |
| Savana | 204 | 78 | 282 | 190 | 394 |

Table 6.2: Proportion of individuals (%) below various poverty lines, Cameroon, 1996

| Indicator and poverty line | Yaoundé | Douala | Other cities | Forests | Highlands | Savana | Cameroon |
|---|---|---|---|---|---|---|---|
| Calorie poverty using common calorie poverty line | 73.4 | 67.3 | 59.9 | 86.5 | 64.6 | 61.1 | 68.1 |
| Food poverty using common food poverty line | 53.1 | 42.0 | 44.5 | 82.5 | 82.5 | 74.0 | 69.5 |
| Food poverty using regional food poverty lines | 67.9 | 67.5 | 63.2 | 82.5 | 61.1 | 61.2 | 66.4 |
| Total expenditure poverty using common lower CBN poverty line | 19.2 | 16.5 | 16.0 | 57.7 | 58.7 | 49.0 | 43.9 |
| Total expenditure poverty using regional lower CBN poverty line | 34.7 | 38.1 | 31.8 | 62.6 | 19.0 | 29.7 | 33.9 |
| Total expenditure poverty using common upper CBN poverty line | 41.6 | 33.4 | 36.5 | 83.8 | 81.1 | 78.7 | 68.0 |
| Total expenditure poverty using regional upper CBN poverty line | 59.6 | 59.0 | 58.8 | 78.1 | 53.1 | 55.8 | 60.1 |
| Proportion of region in total population | 7.1 | 9.6 | 12.7 | 18.5 | 27.8 | 24.2 | 100 |

Table 6.3: Poverty status under pairs of wellbeing indicators, Cameroon, 1996 (% of the population)

| | Calorie poor | Calorie non poor |
|---|---|---|
| Poor in food expenditure | 58.5 | 9.6 |
| Non poor in food expenditure | 11.2 | 20.7 |

| | Poor in total expenditure | Non poor in total expenditure |
|---|---|---|
| Poor in food expenditure | 56.6 | 9.8 |
| Non poor in food expenditure | 11.3 | 22.2 |

| | Poor in total expenditure | Non poor in total expenditure |
|---|---|---|
| Calorie poor | 55.8 | 12.3 |
| Calorie non poor | 12.2 | 19.7 |

Another possibly awkward feature of the use of relative poverty lines is that an improvement in the absolute incomes of some of the poor, with no change
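in the incomes of the others, may in fact increase poverty. To see why, let η and ς be small positive values and let an income distribution be defined as Q(p) + η(p), with

$$\eta(p) = \begin{cases} \eta & \text{if } p \in [p_0 - \varsigma,\ p_0 + \varsigma], \\ 0 & \text{otherwise,} \end{cases}$$

and with η set initially to 0. Choose z = λμ. The unnormalized FGT index is then given by

$$P(\lambda\mu; \alpha) = \int_0^1 \left(\lambda\mu - Q(p) - \eta(p)\right)_+^{\alpha} \, dp.$$

Note that $\partial\mu/\partial\eta = 2\varsigma$, which also says that the relative poverty line λμ increases with an increase in η. We may then check how increases in η affect overall poverty, for a small ς. For the headcount index, we find

$$\left.\frac{\partial P(\lambda\mu; \alpha = 0)}{\partial \eta}\right|_{\eta = 0} = 2\varsigma\,\lambda f(\lambda\mu) \geq 0,$$

where f(λμ) is the density of income at λμ. This says that the headcount necessarily increases whenever someone's income increases, regardless of whether that person is poor or rich. When α > 0,

$$\left.\frac{\partial P(\lambda\mu; \alpha)}{\partial \eta}\right|_{\eta = 0} = 2\varsigma\,\alpha \Big[ \underbrace{\lambda P(\lambda\mu; \alpha - 1)}_{A} \; \underbrace{-\, I[p_0 < F(\lambda\mu)]\,\left(\lambda\mu - Q(p_0)\right)^{\alpha - 1}}_{B} \Big]. \tag{6.4}$$

The term A on the right-hand side of (6.4) is positive: an increase in incomes increases the relative poverty line and thus tends to increase poverty. When p0 > F(λμ), the increase in income is beneficial to the rich: the term B is then nil, and poverty then necessarily increases with η. When p0 < F(λμ), the increase in income benefits some of those below the poverty line, and this increase in their absolute living standards explains why the term B is then negative. Whether it is sufficiently negative to offset the positive term A depends 1) on how far below the poverty line these poor are, and 2) on the value of the ethical parameter α. Hence, even with α > 0, relative poverty may increase when growth is beneficial to the poor.⁴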
E:18.3
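The trade-off between the terms A and B can be seen on a small discrete example. The sketch below is illustrative only: it assumes z equal to half of mean income, a made-up vector of five incomes, and a discrete increase of 0.1 in one person's income in place of the infinitesimal change used in the text.

```python
import numpy as np


def relative_fgt(incomes, alpha, lam=0.5):
    """Unnormalized FGT index with poverty line z = lam * mean income."""
    incomes = np.asarray(incomes, dtype=float)
    z = lam * incomes.mean()
    if alpha == 0:
        return float(np.mean(incomes < z))
    gaps = np.clip(z - incomes, 0.0, None)
    return float(np.mean(gaps ** alpha))


base = np.array([2.0, 9.5, 14.5, 24.0, 50.0])  # mean 20, so z = 10

# Raise the income of the poor person who is close to the poverty line.
raise_near_line = base.copy()
raise_near_line[1] += 0.1

# Raise the income of the poorest person instead.
raise_poorest = base.copy()
raise_poorest[0] += 0.1

p2_base = relative_fgt(base, alpha=2)
p2_near = relative_fgt(raise_near_line, alpha=2)
p2_poorest = relative_fgt(raise_poorest, alpha=2)
p1_base = relative_fgt(base, alpha=1)
p1_near = relative_fgt(raise_near_line, alpha=1)
print(p2_base, p2_near, p2_poorest, p1_base, p1_near)
```

With α = 2, giving the extra income to the poor person near the line raises measured poverty (the rise in the line outweighs that person's small gain), whereas giving it to the poorest person lowers it. With α = 1, the same transfer to the person near the line lowers poverty, which illustrates the dependence on the ethical parameter.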
An alternative poverty line methodology uses subjective information on the link between living standards and wellbeing. One source of information comes from interviews on what is perceived to be a sound poverty line, using a question found for instance in Goedhart, Halberstadt, Kapteyn, and Van Praag (1977):
"We would like to know which net family income would, in your circumstances, be the absolute minimum for you. That is to say, that you would not be able to make both ends meet if you earned less." (p. 510)
The answers are subsequently regressed on the incomes of the respondents. The subjective poverty line is given by the point at which the predicted answer to the minimum income question equals the income of the respondents. The basic intuition for this is that unless someone earns that poverty line, he will not truly know that it is indeed the appropriate minimum income needed to "make both ends meet".
This method is illustrated in some detail on Figure 6.5. Each point represents a separate answer to the above query, namely, the minimum income judged to be needed to make both ends meet as a function of the actual income of the respondents. The filled line shows the predicted response of individuals at a given level of income. For low income levels, this predicted minimum subjective income is well above the respondents' income. The predicted minimum subjective income increases with actual income, but not as fast as income itself. Those with income below z* answer that they need on average more than their own income. Those with income above z* answer that they need on average less than their own income. At z*, which is also where the 45-degree line crosses the line of predicted minimum subjective income, that predicted minimum subjective income equals actual income. The subjective poverty line would therefore be estimated here as z*.
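The fixed point z* can be computed directly once a regression has been fitted. The sketch below assumes a log-linear regression of the stated minimum income on actual income, which is one common parametric choice and not necessarily the specification behind Figure 6.5; the function name and the noise-free answers are hypothetical.

```python
import numpy as np


def subjective_poverty_line(income, min_income):
    """Income level z* at which the predicted minimum income equals actual income.

    Assumes ln(min_income) = intercept + slope * ln(income), fitted by least squares.
    """
    slope, intercept = np.polyfit(np.log(income), np.log(min_income), 1)
    if slope >= 1.0:
        raise ValueError("predicted answers never fall below the 45-degree line")
    # ln z* = intercept + slope * ln z*  implies  z* = exp(intercept / (1 - slope))
    return float(np.exp(intercept / (1.0 - slope)))


# Hypothetical answers: stated needs rise with income, but less than proportionately.
income = np.array([50.0, 100.0, 200.0, 400.0, 800.0])
min_income = np.exp(2.0) * income ** 0.6

z_star = subjective_poverty_line(income, min_income)
print(z_star)
```

A slope below one corresponds to the pattern described above: respondents below z* state a minimum above their own income, and respondents above z* state a minimum below it.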
One difficulty with the subjective approach is the sensitivity of poverty line estimates to the formulation of the interview questions. Another problem comes from the considerable variability in the answers provided, even within groups of relatively socioeconomically homogeneous respondents. The presence of this variability is apparent on Figure 6.5 with points sometimes quite far away from the predicted response line. This variability has some awkward consequences. On Figure 6.5, for instance, an individual at point a is someone who would be judged poor according to the subjective income method since his income falls below z*. An individual at a feels, however, that his income exceeds the minimum income he feels to be needed (point a is to the right of the 45-degree line). He would therefore feel that he is not poor. Conversely, someone at point b feels that he is poor, since his reported minimum income exceeds his actual income, but he would be judged not to be poor by the subjective poverty line method.

⁴ DAD: Poverty|FGT Index.
How, therefore, ought we to interpret this variability? Is it due to measurement errors? If so, then we may probably best ignore it. Is it rather that the link between living standards and true wellbeing varies systematically even within homogeneous groups of people? If so, then we might not want to use incomes or other direct or indirect indicators of wellbeing to classify the poor and the non poor. Instead, we should take individuals at their word on whether they declare themselves to be poor or not. But then, this would clearly raise important practical and incentive problems for the design and the implementation of public policy.
An alternative approach to estimating subjective poverty lines is to ask respondents whether they feel that their income is below the poverty line, without directly asking what the value of that poverty line should be. Answers are coded 0 or 1 — according to whether respondents feel that they are poor or not — alongside the respondents' incomes. The estimate of the poverty line is that which best reconciles the distribution of those answers with that of the respondents' incomes.
This is illustrated in Figure 6.6. Each "dot" is an observation of whether a respondent of a certain income level felt poor (1) or not (0). The working assumption is that respondents compare their income to a common subjective poverty line z*. z* is unobserved and must be estimated. One estimation procedure for z* would be to maximize the likelihood that the respondents' declarations of poverty status correspond to that which would be inferred by comparing z* to their incomes. Said differently, the estimator of z* would minimize the likelihood of observing observations within the ellipses of Figure 6.6. Not everyone with an income below z* says that he is poor; conversely, not everyone above z* says that he is not poor. These "classification errors" would be explained by measurement and/or misreporting errors. Hence, on Figure 6.6, there are "false poor" and "false rich", as shown within the ellipses at the bottom left and at the top right of the Figure. Again, this would run into difficulties if individual preference or need heterogeneity were the true explanation for the "classification errors".
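One simple way to operationalize this estimator is sketched below. It assumes, beyond what the text states, that every respondent misreports his poverty status with the same probability, smaller than one half. Under that assumption, maximizing the likelihood over z* amounts to minimizing the number of "false poor" and "false rich". The data and names are made up.

```python
import numpy as np


def estimate_common_line(income, feels_poor):
    """Threshold z* that minimizes the number of classification errors.

    Candidate thresholds are the midpoints between consecutive observed incomes.
    Ties are broken in favour of the lowest candidate.
    """
    income = np.asarray(income, dtype=float)
    feels_poor = np.asarray(feels_poor, dtype=bool)
    ordered = np.sort(income)
    candidates = (ordered[:-1] + ordered[1:]) / 2.0
    # An error occurs when "income below z" disagrees with the declared status.
    errors = [int(np.sum((income < z) != feels_poor)) for z in candidates]
    best = int(np.argmin(errors))
    return float(candidates[best]), errors[best]


# Hypothetical survey: incomes and self-declared poverty status (1 = feels poor).
income = np.arange(10.0, 130.0, 10.0)
feels_poor = np.array([1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0])

z_star, n_errors = estimate_common_line(income, feels_poor)
print(z_star, n_errors)
```

In this example two observations remain misclassified at the estimated line: one respondent below it who does not feel poor, and one above it who does.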
The literature on the estimation of poverty lines is both significant and varied. Note that there is often a sharp distinction in tone and in content between those works which focus on poverty in less developed countries and those which address poverty in more developed economies.
Early reviews of the literature include Goedhart, Halberstadt, Kapteyn, and Van Praag (1977) and Hagenaars and Van Praag (1985). An excellent and comprehensive recent review can be found in Ravallion (1998b) — this chapter has been much influenced by it. Greer and Thorbecke (1986) has been influential in establishing the FEI method of estimating a poverty line. A method based on "basic needs budget" is described in Renwick and Bergmann (1993). The differential effects for poverty measurement of choosing FEI vs CBN methods for estimating poverty lines can be found inter alia in Ravallion and Bidani (1994) and in Wodon (1997a).
Barrington (1997), Fisher (1992), Glennerster (2000) and Orshansky (1988) provide critical reviews of the literature on the setting of the official poverty line in the United States.
The consequences and the issues that surround the choice between absolute and relative poverty lines are discussed in Blackburn (1998) (on the empirical sensitivity of poverty comparisons to that choice), de Vos and Zaidi (1998) (on whether poverty lines should be country specific), Foster (1998) and Zheng (1994) (on the consequences for the choice of poverty indices), and Fisher (1995) and Madden (2000) (on the empirical income elasticity of poverty lines).
Subjective methods for setting poverty lines are discussed and explored in de Vos and Garner (1991) (for comparisons of results between the US and the Netherlands), Pradhan and Ravallion (2000) (on perceived consumption adequacy), Stanovnik (1992) (for an application to Slovenia), Van den Bosch, Callan, Estivill, Hausman, Jeandidier, Muffels, and Yfantopoulos (1993) (for a comparison across 7 European countries), Blanchflower and Oswald (2000) (for reported levels of happiness in Great Britain and in the US), and Ravallion and Lokshin (2002) (for perceptions of wellbeing in Russia).
As is well known, the assessment of tax and transfer systems draws mainly on two fundamental principles: efficiency and equity. The former relates to the presence of distortions in the economic behavior of agents, while the latter focuses on distributive justice. Vertical equity as a principle of distributive justice is rarely questioned as such, although the extent to which it must be precisely weighted against efficiency is a matter of intense disagreement among policy analysts. A principle of redistributive justice which gathers even greater support is that of horizontal equity, the equal treatment of equals. The HE principle is often seen as a consequence of the fundamental moral principle of the equal worth of human beings, and as a corollary of the equal sacrifice theories of taxation. This chapter and the next cover in turn the measurement of each of these principles.
Let X and N represent respectively gross and net incomes, and let T be taxes net of transfers (the net tax for short). Gross income is pre-tax and/or pre-transfer income, and net income is post-tax and/or post-transfer income, that is, N = X − T. For expositional simplicity, we assume in this chapter that gross incomes are exogenous. This is a common assumption in the literature on the measurement of the impact of taxes and transfers, although it can fail to capture the true impact of tax and transfer policies on wellbeing when these taxes and transfers are nonmarginal.
We can expect a part of the net tax to be a function of the value of gross income X. Otherwise, taxes would be lump sum and orthogonal to gross income. We denote this deterministic part by T(X). For several reasons, we also expect T to be stochastically linked to X. In real life, taxes and transfers depend on a number of variables other than gross incomes, such as family size and composition, age, sex, area of residence, sources of income, type of consumption and savings behavior, and the ability to avoid taxes or claim transfers. Thus, we can think of T as being a stochastic function of X, with

$$T = T(X) + \nu,$$

where ν is a stochastic tax determinant.
We denote by $F_{X,N}(\cdot,\cdot)$ the joint cumulative distribution function (cdf) of gross and net incomes. Let $Q_X(p)$, $Q_N(p)$ and $Q_T(p)$ be the p-quantile functions for gross incomes, net incomes and net taxes, respectively. Let $F_{N|x}(\cdot)$ be the cdf of N conditional on gross income being equal to x. The q-quantile function for net incomes conditional on a p-quantile value for gross incomes is then technically defined as $Q_N(q|p) = \inf\{s \geq 0 \mid F_{N|Q_X(p)}(s) \geq q\}$ for $q \in [0, 1]$, assuming that net incomes are nonnegative. $Q_N(q|p)$ thus gives the net income of the individual whose net income rank is q among all those with gross income equal to $Q_X(p)$.
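The expected net income of those with gross income $Q_X(p)$ is then given by¹

$$\bar{N}(p) = \int_0^1 Q_N(q|p)\, dq,$$

and the expected net tax of those with gross income $Q_X(p)$ is obtained as

$$\bar{T}(p) = Q_X(p) - \bar{N}(p).$$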
An important descriptive and normative tool for capturing the impact of tax and transfer policies is the concentration curve. As we will see, concentration curves can help capture the horizontal and vertical equity of existing tax and transfer systems. They can also serve to predict the impact of reforms to these systems.
E:18.8.11
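The concentration curve for T is²:

$$C_T(p) = \frac{1}{\mu_T}\int_0^p \bar{T}(q)\,dq, \tag{7.4}$$

where $\bar{T}(q)$ is the expected net tax at gross income rank q and $\mu_T$ is average taxes across the population. $C_T(p)$ shows the proportion of total taxes paid by the bottom p proportion of the population.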
¹ DAD: Distribution | Non-Parametric Regression.
² DAD: Curves | Concentration.
In practice, concentration curves are usually estimated by ordering a finite number n of sample observations $(X_1, N_1), \ldots, (X_n, N_n)$ in increasing values of gross incomes, such that $X_1 \le X_2 \le \ldots \le X_n$, with percentiles $p_i = i/n$, $i = 1, \ldots, n$, and with $T_i = X_i - N_i$. For $i = 1, \ldots, n$, the sample (or "empirical") concentration curve for taxes is then defined as

$$\hat{C}_T(p_i) = \frac{1}{n\hat{\mu}_T}\sum_{j=1}^{i} T_j, \tag{7.5}$$

where $\hat{\mu}_T$ is the sample mean of taxes. As for the empirical Lorenz curves, other values of $C_T(p)$ can be estimated by interpolation.
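The estimation steps just described (sort by gross income, cumulate taxes, normalise by total taxes, interpolate between the sample percentiles) can be sketched as follows. This is only an illustrative sketch, not DAD's implementation: the function name `concentration_curve`, the added starting point (0, 0) and the sample values are assumptions made for the example, and average taxes are assumed to differ from zero.

```python
import numpy as np

def concentration_curve(order_by, values):
    """Cumulative share of `values` when observations are sorted by `order_by`.

    Returns (p, C) with p_i = i/n for i = 0..n and C(p_0) = 0, so that the
    curve can be interpolated anywhere on [0, 1].
    """
    order_by = np.asarray(order_by, dtype=float)
    values = np.asarray(values, dtype=float)
    idx = np.argsort(order_by, kind="stable")   # increasing gross incomes
    cum = np.cumsum(values[idx])                # running totals of the variable
    n = values.size
    p = np.arange(n + 1) / n
    C = np.concatenate(([0.0], cum / cum[-1]))  # cum[-1] equals n times the sample mean
    return p, C

# Hypothetical gross and net incomes (illustrative numbers only).
X = np.array([40.0, 10.0, 100.0, 20.0, 30.0])
N = np.array([34.0, 10.0, 70.0, 19.0, 27.0])
T = X - N                                       # net taxes

p, C_T = concentration_curve(X, T)
C_T_at_half = float(np.interp(0.5, p, C_T))     # linear interpolation at p = 0.5
print(np.round(C_T, 4), C_T_at_half)
```

Linear interpolation is one simple choice for values of p between the sample percentiles; other interpolation schemes would serve equally well.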
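The concentration curve $C_N(p)$ for net incomes is analogously defined as

$$C_N(p) = \frac{1}{\mu_N}\int_0^p \bar{N}(q)\,dq, \tag{7.6}$$

with $\bar{N}(q)$ the expected net income at gross income rank q, and typically estimated as

$$\hat{C}_N(p_i) = \frac{1}{n\hat{\mu}_N}\sum_{j=1}^{i} N_j, \tag{7.7}$$

where the $N_j$ have been ordered in increasing values of the associated gross incomes $X_j$. Note that $C_N(p)$ is different from the Lorenz curve of net incomes, $L_N(p)$, which is defined as:

$$L_N(p) = \frac{1}{\mu_N}\int_0^p Q_N(q)\,dq. \tag{7.8}$$

Empirically, the Lorenz curve for net income is typically estimated as

$$\hat{L}_N(p_i) = \frac{1}{n\hat{\mu}_N}\sum_{j=1}^{i} N_j, \tag{7.9}$$

but where the observations have been reordered in increasing values of net incomes, with $N_1 \le N_2 \le \ldots \le N_n$. Thus, $C_N(p)$ sums up the expected value of net incomes up to gross income percentile p. $L_N(p)$, however, sums up net incomes up to a net income percentile p.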
Denote as t the average tax as a proportion of average gross income, with $t = \mu_T/\mu_X$. When $t \ne 0$, we can show that

$$C_N(p) - L_X(p) = \frac{t}{1-t}\left[L_X(p) - C_T(p)\right]. \tag{7.10}$$

For a positive t, this indicates that the more concentrated are the taxes among the poor (the smaller the difference $L_X(p) - C_T(p)$), the less concentrated among the poor will net incomes be. The reverse is true for transfers (negative t): the more concentrated they are among the poor, the more concentrated net income is among the poor. This link will prove useful later in defining indices of tax progressivity.
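The link between the concentration curve of net incomes, the Lorenz curve of gross incomes and the concentration curve of taxes follows from N = X − T and can be checked numerically on any sample. The sketch below does so on simulated data; the income distribution, the tax rule and the helper name `cum_share` are hypothetical and serve only to illustrate that $C_N(p) - L_X(p)$ equals $\frac{t}{1-t}[L_X(p) - C_T(p)]$ at every sample percentile.

```python
import numpy as np

def cum_share(values_sorted):
    """Cumulative shares of an already-ordered variable, starting at 0."""
    c = np.cumsum(values_sorted)
    return np.concatenate(([0.0], c / c[-1]))

rng = np.random.default_rng(0)
X = rng.lognormal(mean=10.0, sigma=0.8, size=1000)          # hypothetical gross incomes
T = 0.3 * X - 2000.0 + rng.normal(0.0, 500.0, size=1000)    # hypothetical stochastic net tax

order = np.argsort(X, kind="stable")     # everything is ordered by gross income
Xs, Ts = X[order], T[order]
Ns = Xs - Ts

L_X = cum_share(Xs)                      # Lorenz curve of gross incomes
C_T = cum_share(Ts)                      # concentration curve of taxes
C_N = cum_share(Ns)                      # concentration curve of net incomes
t = Ts.mean() / Xs.mean()                # average tax over average gross income

lhs = C_N - L_X
rhs = t / (1.0 - t) * (L_X - C_T)
max_abs_diff = float(np.max(np.abs(lhs - rhs)))
print(max_abs_diff)                      # differs from zero only by rounding error
```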
As for the Lorenz curves and the S-Gini indices of inequality introduced earlier, we can aggregate the distance between p and the concentration curves C(p) to obtain summary indices of concentration. These indices of concentration are useful to compute aggregate indices of progressivity and vertical equity. More generally, they can also serve to decompose the inequality in total income or total consumption into a sum of the concentration of the components of that total income or consumption, such as different sources of income (different types of earnings, interests, dividends, capital gains, taxes, transfers, etc.) or different types of consumption (of food, clothing, housing, etc.).
To define indices of concentration, we can simply weight the distance p − C(p) by an ethical weight $\kappa(p)$, of which a popular form is again given by $\kappa(p; \rho)$ in equation (4.8). This gives the following class of S-Gini indices of concentration, $IC(\rho)$³:

$$IC(\rho) = \int_0^1 \left(p - C(p)\right)\kappa(p;\rho)\,dp. \tag{7.11}$$
An S-Gini inequality index for a variable can easily be decomposed as a sum of the concentration indices of the component variables that add up to that variable. This can be useful, for instance, for decomposing total income inequality as a sum of concentration indices for the different sources of income (employment, capital, transfers, etc.), or total expenditure inequality as a sum of concentration indices for food and non-food expenditures, say. For example, let X(1) and X(2) be two types of expenditures, and let X = X(1) + X(2) be total consumption. Let $C_{X_{(1)}}(p)$ and $C_{X_{(2)}}(p)$ be the concentration curves of each of the two types of consumption (using X as the ordering variable). The concentration indices for X(c), $IC_{X_{(c)}}(\rho)$, c = 1, 2, are as follows:

$$IC_{X_{(c)}}(\rho) = \int_0^1 \left(p - C_{X_{(c)}}(p)\right)\kappa(p;\rho)\,dp. \tag{7.12}$$

³ DAD: Redistribution | Coefficient of Concentration.
Inequality in X can then be decomposed as a sum of the inequality in X(1) and in X(2). The Lorenz curve for total consumption is given by:

$$L_X(p) = \frac{\mu_{X_{(1)}}}{\mu_X}\,C_{X_{(1)}}(p) + \frac{\mu_{X_{(2)}}}{\mu_X}\,C_{X_{(2)}}(p), \tag{7.13}$$

which is a simple weighted sum of the concentration curves for each of the two types of consumption. The index of inequality in total consumption is similarly a simple weighted sum of the concentration indices of each of the two types of consumption⁴:

$$I_X(\rho) = \frac{\mu_{X_{(1)}}}{\mu_X}\,IC_{X_{(1)}}(\rho) + \frac{\mu_{X_{(2)}}}{\mu_X}\,IC_{X_{(2)}}(\rho). \tag{7.14}$$

For given $\mu_{X_{(1)}}$ and $\mu_{X_{(2)}}$, the higher the concentration indices $IC_{X_{(1)}}(\rho)$ and $IC_{X_{(2)}}(\rho)$, the larger the S-Gini index of inequality in total consumption. Moreover, the higher the share $\mu_{X_{(c)}}/\mu_X$ of the more highly concentrated expenditure, the higher the inequality in total expenditures⁵.
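The decomposition by components can be illustrated with a small numerical sketch. Several elements below are assumptions made for the example rather than taken from the text: the weight is taken to be $\kappa(p;\rho) = \rho(\rho-1)(1-p)^{\rho-2}$ (the usual S-Gini form, assumed here to be what equation (4.8) specifies), the integral is approximated by a trapezoidal rule on the sample percentiles, ρ is restricted to values of at least 2, and the expenditure figures and function names are hypothetical.

```python
import numpy as np

def kappa(p, rho):
    # Assumed S-Gini weight; equals 2 everywhere when rho = 2 (standard Gini).
    return rho * (rho - 1.0) * (1.0 - p) ** (rho - 2.0)

def curve(order_by, values):
    """Cumulative shares of `values`, ordered by `order_by`, starting at 0."""
    idx = np.argsort(np.asarray(order_by, dtype=float), kind="stable")
    c = np.cumsum(np.asarray(values, dtype=float)[idx])
    return np.concatenate(([0.0], c / c[-1]))

def sgini_index(C, rho=2.0):
    """Trapezoidal approximation of the integral of (p - C(p)) * kappa(p; rho)."""
    n = len(C) - 1
    p = np.arange(n + 1) / n
    y = (p - C) * kappa(p, rho)
    return float(np.sum((y[1:] + y[:-1]) / 2.0) / n)

# Hypothetical food and non-food expenditures.
X1 = np.array([4.0, 5.0, 6.0, 7.0, 8.0])
X2 = np.array([1.0, 2.0, 6.0, 13.0, 48.0])
X = X1 + X2

s1, s2 = X1.mean() / X.mean(), X2.mean() / X.mean()   # shares in total consumption
IC_1 = sgini_index(curve(X, X1))   # concentration index of component 1 (ordered by X)
IC_2 = sgini_index(curve(X, X2))   # concentration index of component 2 (ordered by X)
I_X = sgini_index(curve(X, X))     # S-Gini inequality index of total consumption

print(I_X, s1 * IC_1 + s2 * IC_2)  # the two numbers coincide
```

Because the decomposition is linear in the curves, any linear quadrature rule preserves it exactly; the trapezoidal rule is used here only for simplicity.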
E:18.8.32
One possible difficulty with the above is that a component which has the same value for all will be judged by the decompositions in (7.13) and (7.14) to have a zero contribution to total inequality. This is because $C_{X_{(c)}}(p) = p$ for all p and $IC_{X_{(c)}}(\rho) = 0$ if component c is equally distributed across all individuals. It may be argued, however, that in such a case component c should be seen as contributing negatively to total inequality. Being the same for all, component c indeed decreases the inequality introduced by other components. One way to capture this is to rewrite the decompositions (7.13) and (7.14) in reference to $L_X(p)$ and $I_X(\rho)$. This gives:

$$\frac{\mu_{X_{(1)}}}{\mu_X}\left(C_{X_{(1)}}(p) - L_X(p)\right) + \frac{\mu_{X_{(2)}}}{\mu_X}\left(C_{X_{(2)}}(p) - L_X(p)\right) = 0 \tag{7.15}$$

and

$$\frac{\mu_{X_{(1)}}}{\mu_X}\left(IC_{X_{(1)}}(\rho) - I_X(\rho)\right) + \frac{\mu_{X_{(2)}}}{\mu_X}\left(IC_{X_{(2)}}(\rho) - I_X(\rho)\right) = 0. \tag{7.16}$$

The two terms on the left of each of these last two expressions give respectively the contributions of components 1 and 2 to the Lorenz curve and the inequality index of total expenditure X. Those contributions must sum to zero.
⁴ DAD: Decomposition | S-Gini: Decomposition by Sources.
⁵ DAD: Decomposition | S-Gini: Decomposition by Sources.
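An alternative approach uses the Shapley value to express inequality in total income as a sum of the contributions of inequality in individual income components. For expositional simplicity, assume again that there are only two income components, X(1) and X(2). Total inequality is then given by $I(X_{(1)}, X_{(2)})$. Suppose that we replace the two income components X(1) and X(2) by their mean values $\mu_{X_{(1)}}$ and $\mu_{X_{(2)}}$, to yield $I(\mu_{X_{(1)}}, \mu_{X_{(2)}})$. Clearly, inequality would be zero after such a substitution. Total inequality can then be expressed as:

$$\begin{aligned} I(X_{(1)}, X_{(2)}) &= I(X_{(1)}, X_{(2)}) - I(\mu_{X_{(1)}}, \mu_{X_{(2)}}) \\ &= I(X_{(1)}, X_{(2)}) - I(\mu_{X_{(1)}}, X_{(2)}) \\ &\quad + I(\mu_{X_{(1)}}, X_{(2)}) - I(\mu_{X_{(1)}}, \mu_{X_{(2)}}). \end{aligned} \tag{7.17}$$

An estimate of the contribution of component 1 to total inequality would be given by the second line, and the third line would indicate the contribution of component 2. These estimated contributions are in general dependent upon the order in which the components are replaced by their mean value. The contribution of component 1 could for instance be estimated alternatively as $I(X_{(1)}, \mu_{X_{(2)}})$. To solve this order dependency problem, we can use the Shapley value to define the contribution of a component c to total inequality as its expected contribution to inequality reduction when it is added randomly to any one of the various subsets of components that one can choose from the set of all components. With two components, this gives⁶:

$$\begin{aligned} \text{contribution of component 1} &= \tfrac{1}{2}\left[I(X_{(1)}, X_{(2)}) - I(\mu_{X_{(1)}}, X_{(2)}) + I(X_{(1)}, \mu_{X_{(2)}})\right], \\ \text{contribution of component 2} &= \tfrac{1}{2}\left[I(X_{(1)}, X_{(2)}) - I(X_{(1)}, \mu_{X_{(2)}}) + I(\mu_{X_{(1)}}, X_{(2)})\right]. \end{aligned} \tag{7.18}$$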
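The two-component Shapley procedure can be sketched numerically by averaging, for each component, its marginal effect on inequality over the two possible orders in which components are replaced by their means. The sketch below is illustrative only: it uses the standard Gini index as the inequality measure I, and the data and function names are hypothetical.

```python
import numpy as np

def gini(x):
    """Standard Gini index from the rank-based formula."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    i = np.arange(1, n + 1)
    return float(np.sum((2 * i - n - 1) * x) / (n * np.sum(x)))

def shapley_two_components(x1, x2, index=gini):
    """Shapley contributions of two components to index(x1 + x2)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    total = index(x1 + x2)            # I(X1, X2)
    i_m1 = index(x1.mean() + x2)      # I(mu1, X2): component 1 equalised
    i_m2 = index(x1 + x2.mean())      # I(X1, mu2): component 2 equalised
    # I(mu1, mu2) = 0, since everyone then has the same total.
    c1 = 0.5 * ((total - i_m1) + (i_m2 - 0.0))
    c2 = 0.5 * ((total - i_m2) + (i_m1 - 0.0))
    return total, c1, c2

# Hypothetical income components.
X1 = np.array([4.0, 5.0, 6.0, 7.0, 8.0])
X2 = np.array([1.0, 2.0, 6.0, 13.0, 48.0])

total, contrib_1, contrib_2 = shapley_two_components(X1, X2)
print(total, contrib_1, contrib_2)    # the two contributions add up to the total
```

By construction the two contributions sum exactly to total inequality, whatever the inequality index passed as `index`.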
⁶ DAD: Decomposition | S-Gini: Decomposition by Sources.
Let us for a moment assume that the tax system is non-stochastic (or deterministic), namely, that ν equals a constant zero. Suppose also for now that this deterministic tax system does not rerank individuals, or equivalently that $T^{(1)}(X) \le 1$. Furthermore, denote the average rate of taxation at gross income X by t(X), with t(X) = T(X)/X⁷.
E:18.8.9
Assuming no reranking, a net tax (possibly including a transfer or subsidy) T(X) is said to be
locally progressive at X = x if the average rate of taxation increases with X, that is, if $t^{(1)}(x) > 0$;
locally proportional at X = x if the average rate of taxation stays constant with X, that is, if $t^{(1)}(x) = 0$;
and locally regressive at X = x if the average rate of taxation decreases with X, that is, if $t^{(1)}(x) < 0$.
E: 18.8.10
There are two popular "local" measures to capture the change in taxes and net income as gross income increases. One is the elasticity of taxes with respect to X, also called Liability Progression, LP(X):

$$LP(X) = \frac{X\,T^{(1)}(X)}{T(X)} = \frac{T^{(1)}(X)}{t(X)}. \tag{7.19}$$

LP(X) is simply the ratio of the marginal tax rate over the average tax rate at X. It is possible to show that a tax system is everywhere progressive (namely, $t^{(1)}(X) > 0$ everywhere) if LP(X) > 1 everywhere. The larger this measure at every X, the more concentrated among the richer are the taxes.
One problem with LP(X) is that it is not defined when T(X) = 0, and that it is awkward to interpret when a net tax is sometimes negative and sometimes positive across gross income. Another problem is that it is linked to the relative distribution of taxes, not with the relative distribution of the associated net incomes.
These problems are avoided by the use of a second local measure of progression, called Residual Progression (RP(X)), which is the elasticity of net income with respect to gross income:

$$RP(X) = \frac{\partial\left(X - T(X)\right)}{\partial X}\cdot\frac{X}{X - T(X)} = \frac{1 - T^{(1)}(X)}{1 - t(X)}. \tag{7.20}$$

Unlike LP(X), RP(X) is well defined and easily interpretable even when taxes are sometimes negative, positive or zero, so long as gross and net incomes are strictly positive. It is then possible to show that a tax system is everywhere progressive (again, this means that $t^{(1)}(X) > 0$ everywhere) if RP(X) < 1 everywhere.
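The contrast between the two local measures can be seen on a simple deterministic schedule. The sketch below assumes a hypothetical linear net tax T(X) = 0.3X − 3000 (a flat marginal rate combined with a lump-sum transfer), whose average rate rises with X so that it is everywhere locally progressive. The schedule, parameter values and function names are illustrative assumptions only.

```python
import numpy as np

A, B = 0.3, 3000.0            # hypothetical marginal rate and lump-sum transfer

def tax(x):
    """Net tax T(X); negative values are net transfers."""
    return A * x - B

def marginal_rate(x):
    """First derivative of T(X), constant for this linear schedule."""
    return np.full_like(np.asarray(x, dtype=float), A)

def liability_progression(x):
    """Elasticity of taxes with respect to gross income."""
    x = np.asarray(x, dtype=float)
    return marginal_rate(x) * x / tax(x)

def residual_progression(x):
    """Elasticity of net income with respect to gross income."""
    x = np.asarray(x, dtype=float)
    return (1.0 - marginal_rate(x)) * x / (x - tax(x))

X = np.array([5000.0, 20000.0, 50000.0])
avg_rate = tax(X) / X                    # rises with X: locally progressive
LP = liability_progression(X)
RP = residual_progression(X)
print(avg_rate, LP, RP)
```

At X = 5000 the net tax is negative and LP equals −1, which is hard to interpret even though the schedule is progressive there; LP is not defined at all at X = 10000, where T(X) = 0. RP stays below 1 at all three income levels.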
There is a nice link between these measures of progressivity and the redistributive impact of taxes.
⁷ DAD: Distribution | Non-Parametric Regression.
Progressivity and inequality reduction
Assuming no reranking, the following conditions are equivalent:
1 $t^{(1)}(X) > 0$ for all X;
2 LP(X) > 1 for all X (assuming T(X) > 0);
3 RP(X) < 1 for all X;
4 $L_X(p) > C_T(p)$ for all p and for any distribution $F_X$ of gross income (assuming $\mu_T > 0$);
5 $L_N(p) > L_X(p)$ for all p and for any distribution $F_X$ of gross income.
Progressive taxation will thus make the distribution of net incomes unambiguously more equal than the distribution of gross incomes, regardless of that actual distribution of gross incomes. Moreover, if the residual progression for a tax system A is always lower than that of a tax system B, whatever the value of X, then the tax system A is said to be everywhere more residual progressive than the tax system B, and the distribution of net incomes will always be more equal under A than under B, again regardless of the distribution of gross incomes.
Hence, an important distributive consequence of progressive taxation is to make the inequality of net incomes lower than that of gross income. Analogously, proportional taxation will not change inequality, and regressive taxation will increase inequality. The more progressive the tax system, the more inequality-reducing it is. To check whether a deterministic tax system is progressive, proportional or regressive, we may thus simply plot the average tax rate as a function of X and observe its slope. Alternatively, we may estimate and graph its liability progression or its residual progression at various values of X. To check whether a tax system is more residual progressive (and thus more redistributive) than another one, we simply plot and compare the elasticity of net incomes with respect to gross incomes. All of this can be done using non-parametric regressions of T(X) and N against X.⁸
⁸ DAD: Distribution | Non-Parametric Regression.
Another informative descriptive approach is to compare the share in taxes and benefits to the share in the population of individuals at various ranks in the distribution of gross income. This is most easily done by plotting on a graph the ratios $T(X)/\mu_T$ or $\bar{T}(p)/\mu_T$ for various values of X or p, where $\bar{T}(p)$ is the expected net tax at gross income rank p. If these ratios exceed 1, then those individuals with those incomes or ranks pay a greater share of total taxes than their population share. A similar intuition applies when T(·) is a benefit: a ratio $T(X)/\mu_T$ or $\bar{T}(p)/\mu_T$ that exceeds 1 indicates that the benefit share exceeds the population share. If T(X) or $\bar{T}(p)$ increases proportionately faster than X or $Q_X(p)$, then the tax system is everywhere locally progressive.
A competing descriptive tool is to plot the ratio of taxes over gross income, that is, T(X)/X, perhaps assessed at some rank p to give $\bar{T}(p)/Q_X(p)$. Such a graph shows how the average tax rate evolves with gross income or ranks. When these ratios increase everywhere with X, the tax is everywhere locally progressive.
Although graphically informative, the above simple descriptive approaches present three main problems. First, if T(1) (X) > 1, the tax system will induce reranking, even if it is a deterministic function of X. As we will see below, reranking (and, more generally, horizontal inequity) decreases the redistributive effect of taxation, besides being of significant ethical concern in its own right.
Second, and more importantly in empirical applications, taxes are typically not a deterministic function of gross income, and randomness in taxes will introduce greater variability and inequality in net incomes than the above deterministic approach would predict. X − T(X) may then be an unreliable guide to the distribution of net incomes, and the above theorems relating local progression measures to global redistributive impact lose a great part of their practical usefulness. Randomness in taxes will also introduce further reranking. These features will reduce the redistributive effect of the tax, and may in the most extreme cases increase inequality even when the "deterministic trend" of the tax is progressive, that is, even when $t^{(1)}(X) > 0$.
Third, the actual redistribution effected by taxes depends on the distribution of gross incomes, and not only on the shape of the tax function T. Said differently, the actual redistributive effect of Liability or residual progression will depend on the actual distribution of gross incomes. Arguably, the actual redistribution operated by a tax system is probably of greater interest than its potential impact. A tax may be very locally progressive over some ranges of gross income, but the actual redistributive impact will depend on the interaction of this local progression with the distribution of gross incomes.
To deal with these difficulties, we can use the actual distribution of taxes T and net incomes N (instead of their predicted values T(X) and X − T(X)) to determine whether the actual tax system is really progressive and inequality-reducing. This amounts to combining the local measures of progressivity with the distribution of gross incomes to generate global measures of progressivity.
There are two leading approaches for this exercise. The first is the Tax-redistribution (TR) approach, and the second is the Income-redistribution (IR) approach. The global definitions of tax progressivity associated with each of these approaches are as follows.
E: 18.8.2
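1 For TR progressivity:
(a) A tax T is TR-progressive if⁹
$$L_X(p) > C_T(p) \quad \text{for all } p \in\, ]0, 1[. \tag{7.21}$$
(b) A benefit B is TR-progressive if¹⁰
$$C_B(p) > L_X(p) \quad \text{for all } p \in\, ]0, 1[. \tag{7.22}$$
(c) A tax T(1) is more TR-progressive than a tax T(2) if¹¹
$$C_{T_{(1)}}(p) < C_{T_{(2)}}(p) \quad \text{for all } p \in\, ]0, 1[. \tag{7.23}$$
(d) A benefit B(1) is more TR-progressive than a benefit B(2) if¹²
$$C_{B_{(1)}}(p) > C_{B_{(2)}}(p) \quad \text{for all } p \in\, ]0, 1[. \tag{7.24}$$
(e) A tax T is more TR-progressive than a benefit B if¹³
$$L_X(p) - C_T(p) > C_B(p) - L_X(p) \quad \text{for all } p \in\, ]0, 1[. \tag{7.25}$$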
E:18.8.3
2 For IR progressivity:
(a) A net tax T is IR-progressive if¹⁴
$$C_{X-T}(p) > L_X(p) \quad \text{for all } p \in\, ]0, 1[, \tag{7.26}$$
where $C_{X-T}(p)$ is the concentration curve of net incomes X − T.
(b) A net tax T(1) is more IR-progressive than a net tax (and/or a transfer) T(2) if¹⁵
$$C_{X-T_{(1)}}(p) > C_{X-T_{(2)}}(p) \quad \text{for all } p \in\, ]0, 1[. \tag{7.27}$$
⁹ DAD: Redistribution | Tax or Transfer.
¹⁰ DAD: Redistribution | Tax or Transfer.
¹¹ DAD: Redistribution | Tax/Transfer vs Tax/Transfer.
¹² DAD: Redistribution | Tax/Transfer vs Tax/Transfer.
¹³ DAD: Redistribution | Transfer vs Tax.
¹⁴ DAD: Redistribution | Tax or Transfer.
¹⁵ DAD: Redistribution | Tax/Transfer vs Tax/Transfer.
These two TR and IR approaches are consistent with the use above of liability and residual progression in a deterministic tax system. If ν = 0 in (7.1), and if $t^{(1)}(X) > 0$ and $T^{(1)}(X) \le 1$ (namely, no reranking), then, whatever the actual distribution of gross incomes, T(X) is both TR- and IR-progressive. Furthermore, if $LP_{(1)}(X) > LP_{(2)}(X)$ at all values of X, then the tax system 1 is necessarily more TR-progressive than the tax system 2. And if $RP_{(1)}(X) < RP_{(2)}(X)$ at all values of X, then the tax system 1 is necessarily more IR-progressive than the tax system 2.
Note that these progressivity comparisons have as a reference point the initial Lorenz curve. In other words, a tax is progressive if the poorest individuals bear a share of the total tax burden that is less than their share in total gross income. As mentioned above, an alternative reference point would be the cumulative shares in the population. This is often argued in the context of state support — the reference point to assess the equity of public expenditures is population share. The analytical framework above can easily allow for this alternative view — for instance, simply by replacing LX (p) by p in the above definitions of TR progressivity. This will make more stringent the conditions to declare a benefit to be progressive, but it will also make it easier for a tax to be declared progressive — to see this, compare (7.21) and (7.22).
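The effect of the choice of reference point can be illustrated on a small sample. The sketch below checks, at the sample percentiles only, whether the cumulative tax shares of the poorest lie below their cumulative gross income shares (the Lorenz reference) or below their population shares (the alternative reference). The function names, the `reference` argument and the data are hypothetical, and a comparison at a handful of sample points is of course not a statistical test of dominance.

```python
import numpy as np

def cum_share(values_sorted):
    """Cumulative shares of an already-ordered variable, starting at 0."""
    c = np.cumsum(values_sorted)
    return np.concatenate(([0.0], c / c[-1]))

def tr_progressive(X, T, reference="lorenz"):
    """True if the reference curve lies strictly above C_T at all interior sample points."""
    X = np.asarray(X, dtype=float)
    T = np.asarray(T, dtype=float)
    idx = np.argsort(X, kind="stable")          # order by gross income
    C_T = cum_share(T[idx])
    n = X.size
    if reference == "lorenz":
        ref = cum_share(X[idx])                 # gross income shares, L_X(p)
    else:
        ref = np.arange(n + 1) / n              # population shares, p
    gap = ref - C_T
    return bool(np.all(gap[1:-1] > 0))          # interior points only

X = np.array([10.0, 20.0, 30.0, 40.0, 100.0])   # hypothetical gross incomes
T_a = np.array([0.0, 1.0, 3.0, 6.0, 30.0])      # tax rising faster than income
T_b = np.array([2.0, 3.0, 4.0, 5.0, 6.0])       # tax rising more slowly than income

result_a = tr_progressive(X, T_a)
result_b_lorenz = tr_progressive(X, T_b)
result_b_population = tr_progressive(X, T_b, reference="population")
print(result_a, result_b_lorenz, result_b_population)
```

The second tax fails the test against gross income shares but passes it against population shares, which is the sense in which the population reference makes it easier for a tax to be declared progressive.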
Many of the classical texts on the concept, the role and the measurement of tax progressivity date from the 1950's but they are still very relevant today; they include Blum and Kalven Jr. (1963), Musgrave and Thin (1948), Slitor (1948) and Vickrey (1972). See also Okun (1975) for an influential discussion of the interaction between efficiency and equity issues, as well as Pechman (1985) on incidence analysis.
The measurement of progressivity and vertical equity moved forward significantly in the middle of the 1970's following the slightly earlier advances on the measurement of inequality — see, for instance, Fellman (1976), Jakobsson (1976) and Kakwani (1977a) for the link between progressivity and inequality reduction, and Kakwani (1977b), Suits (1977) and Reynolds and Smolensky (1977) for influential indices of tax progressivity and vertical equity. Reviews of the literature can be found in Lambert (1993) and Lambert (2001).
For papers that address general links between progressivity and inequality, see Davies and Hoy (2002) (for the inequality-reducing properties of "flat taxes"), Latham (1993) (for how to assess whether one tax is more progressive than another), Liu (1985) (for tax progressivity and Lorenz dominance), Moyes and Shorrocks (1998) (on the difficulties that arise for the measurement of progressivity when households differ in needs), and Thistle (1988) (for residual progression and progressivity).
Numerous versions of other specific tax progressivity indices have been discussed and presented over the years. These include, for example, Baum (1987), for "relative share adjustment" indices; Blackorby and Donaldson (1984) and Kiefer (1984), for normatively-based indices of progressivity; Duclos (1995a), Duclos and Tabi (1996) and Duclos (1997b), for indices of the "social performance" of tax progressivity; Duclos (1998), for normative foundations for the Suits progressivity index; Hayes, Slottje, and Lambert (1992), for effective tax progression across percentiles; and Zandvakili (1994), Zandvakili (1995) and Zandvakili and Mills (2001) for the use of progressivity indices derived from Generalized entropy and Atkinson inequality indices.
Linear indices of progressivity derive from the class of linear inequality indices introduced in Mehran (1976). They are discussed inter alia in Duclos (2000), Kakwani (1987), Pfahler (1983) and Pfahler (1987).
Some of the literature has also tended to focus on the tension and on the links between local and global progressivity. See, for instance, Baum (1998), Cassady, Ruggeri, and Van Wart (1996), Formby, Seaks, and Smith (1984), Formby, Smith, and Thistle (1987), Formby, Smith, and Thistle (1990) and Formby, Smith, and Sykes (1986). See also Duclos (1995a) for a method for estimating the average residual progression of unevenly progressive tax and benefit systems, and Keen, Papapanagos, and Shorrocks (2000) and Le Breton, Moyes, and Trannoy (1996) for the impact of changes in tax components (such as sizes of allowances) on the progressivity of the overall tax system.
The influence of the "initial distribution" (that of gross incomes) on progressivity measurement is studied in Dardanoni and Lambert (2002) (for a "transplant-and-compare" procedure) and in Lambert and Pfahler (1992); see also the comment by Milanovic (1994a). Yardsticks for assessing the effectiveness of tax and benefit policies in reducing initial inequality are proposed in Fellman, Jäntti, and Lambert (1999) and Fellman (2001).
How income is measured is also of importance for the measurement of progressivity and redistribution. See in particular Altshuler and Schwartz (1996) (for the annual vs a "time-exposure" incidence of the US child care tax credit), Caspersen and Metcalf (1994) (for the annual vs lifetime incidence of value-added taxes), Creedy and van de Ven (2001) (for the annual vs lifetime incidence of the Australian tax and benefit system), Lyon and Schwab (1995) (for the annual vs lifetime incidence of taxes on cigarettes and alcohol), Metcalf (1994) (for the lifetime incidence of US state and local taxes), Nelissen (1998) (for the lifetime incidence of Dutch social security),
Empirical studies of progressivity and redistribution have been very numerous over the last three decades. They include Bishop, Chow, and Formby (1995a) (redistribution in six LIS countries), Borg, Mason, and Shapiro (1991) (regressivity of taxes on casino gambling), Davidson and Duclos (1997) (progressivity in Canada), Decoster and Van Camp (2001) (the redistributive effect of a shift from direct to indirect taxation in Belgium), Dilnot, Kay, and Norris (1984) (progressivity in the UK between 1948 and 1982), Duclos and Tabi (1999) (redistribution in Canada), Giles and Johnson (1994) (redistribution in the UK), Gravelle (1992) (the redistributive effect of the 1986 US tax reform), Hanratty and Blank (1992) (the comparative poverty effect of redistributive policies in the US and in Canada), Heady, Mitrakos, and Tsakloglou (2001) (the redistributive effect of social transfers in the European Union), Hills (1991) (the redistributive effect of British housing subsidies), Howard, Ruggeri, and Van Wart (1994) (the redistributive effect of taxes in Canada), Khetan and Poddar (1976) (redistribution in Canada), Loomis and Revier (1988) (the redistributive effect of excise taxes), Mercader Prats (1997) (redistribution in Spain, 1980-1994), Milanovic (1995) (the redistributive effect of transfers in Eastern Europe and in Russia), Morris and Preston (1986) (redistribution in the UK), Norregaard (1990) (tax progressivity in the OECD countries), O'Higgins and Ruggles (1981) (redistribution in the UK), O'Higgins, Schmaus, and Stephenson (1989) (comparative redistribution of taxes and transfers in seven countries), Persson and Wissen (1984) (the impact of tax evasion on redistribution), Price and Novak (1999) (the regressivity of implicit taxes on lottery games), Ruggeri, Van Wart, and Howard (1994) (the redistributive impact of government spending in Canada), Ruggles and O'Higgins (1981) (the redistributive impact of government spending in the US), Schwarz and Gustafsson (1991) (redistribution in Sweden), Smeeding and Coder (1995) (redistribution in 6 LIS countries), van Doorslaer, Wagstaff, van der Burg, Christiansen, Citoni, Di Biase, Gerdtham, Gerfin, Gross, and Hakinnen (1999) (the redistributive impact of health care financing in 12 OECD countries), Vermaeten, Gillespie, and Vermaeten (1995) (the redistributive impact of taxes in Canada, 1951-1988), Wagstaff and van Doorslaer (1997) (the redistributive impact of health care financing in the Netherlands), Wagstaff, van Doorslaer, Hattem, Calonge, Christiansen, Citoni, Gerdtham, Gerfin, Gross, and Hakinnen (1999) (the redistributive impact of personal income taxation in 12 OECD countries), and Younger, Sahn, Haggblade, and Dorosh (1999) (tax incidence in Madagascar).
Benefit incidence analysis is also regularly carried out in less developed economies; see, for instance, Lanjouw and Ravallion (1999) for the role of differentiated "program capture" in explaining the evolution of the incidence of benefits, Sahn, Younger, and Simler (2000) for a dominance analysis of benefit incidence in Romania, van de Walle (1998a) for a discussion of general issues, and Wodon and Yitzhaki (2002) for the role of program allocation rules in the study of benefit incidence.
There have been numerous papers decomposing the Gini indices into sums of contributions of income sources. These include Aaberge, Bjorklund, Jantti, Pedersen, Smith, and Wennemo (2000), Achdut (1996), Cancian and Reed (1998), Gustafsson and Shi (2001), Keeney (2000), Leibbrandt, Woolard, and Woolard (2000), Lerman (1999), Lerman and Yitzhaki (1985), Morduch and Sicular (2002), Podder (1993), Podder and Mukhopadhaya (2001), Podder and Chatterjee (2002), Reed and Cancian (2001), Shorrocks (1982), Silber (1989), Silber (1993), Silber (1989), Sotomayor (1996), Wodon (1999), and Yao (1997).
In this chapter, we examine in more detail a more neglected aspect of the notion of redistributive justice: horizontal equity (HE) in taxation (including negative taxation).¹ Two main approaches to the measurement of HE are found in the literature, which has evolved substantially in the last thirty years. The classical formulation of the HE principle prescribes the equal treatment of individuals who share the same level of welfare before government intervention. HE may also be viewed as implying the absence of reranking: for a tax to be horizontally equitable, the ranking of individuals on the basis of pre-tax welfare should not be altered by a fiscal system. Most of the analysis below will involve ethical indices. We will see that, depending on the choice of the underlying social welfare function or inequality index, horizontal inequity will be captured either by a "classical" horizontal inequity index or by a "reranking" one.
Why should concerns for horizontal equity influence the design of an optimal tax and transfer system? Several answers have been provided, using either of two approaches. The traditional or "classical" approach defines HE as the equal treatment of equals (see Musgrave (1959)). While this principle is generally well accepted, different rationales are advanced to support it. First, a tax which discriminates between comparable individuals is liable to create resentment and a sense of insecurity, possibly also leading to social unrest.
¹ This chapter draws extensively from Duclos, Jalbert, and Araar (2003), where more details can be found.
Second, the principles of progressivity and income redistribution, which are key elements of most tax and transfer systems, are generally undermined by horizontal inequity (HI), as we shall see in our own treatment below. This has indeed been one of the main themes in the development of the reranking approach in the last decades. Hence, a desire for HE may simply derive from a general aversion to inequality, without any further appeal to other normative criteria. HI may moreover suggest the presence of imperfections in the operation of the tax and transfer system, such as an imperfect delivery of social welfare benefits, attributable to poor targeting or to incomplete take-up. It can also signal tax evasion, which can inter alia cost the government significant losses of tax revenue.
Third, HE can be argued to be an ethically more robust principle than VE. VE asks for the reduction of welfare gaps between unequal individuals. Depending on the retained specification of distributive fairness, the strength of the requirements of vertical justice can vary considerably, while the integrity of the principle of horizontal equity remains essentially invariant. This has led several authors to advocate that HE be treated as a separate principle from VE, and thus that HE be one of the objectives over which optimal tradeoffs are assessed for the setting of tax policy.
The theory of relative deprivation also suggests that people often specifically compare their relative individual fortune with that of others in similar or close circumstances. The first to formalize the theory of relative deprivation, Davis (1959), expressly allowed for this by suggesting how comparisons with similar vs dissimilar others lead to different kinds of emotional reactions; he used the expression "relative deprivation" for "in-group" comparisons (i.e., for HI), and "relative subordination" for "out-group" comparisons (i.e., for VE) (Davis 1959, p.283). Moreover, in the words of Runciman (1966), another important contributor to that theory, "people often choose reference groups closer to their actual circumstances than those which might be forced on them if their opportunities were better than they are" (p.29).
In a discussion of the post-war British welfare state, Runciman also notes that "the reference groups of the recipients of welfare were virtually bound to remain within the broadly delimited area of potential fellow-beneficiaries. It was anomalies within this area which were the focus of successive grievances, not the relative prosperity of people not obviously comparable" (p.71). Finally, in his theory of social comparison processes, Festinger (1954) also argues that "given a range of possible persons for comparison, someone close to one's own ability or opinion will be chosen for comparison" (p.121). In an income redistribution context, it is thus plausible to assume that comparative reference groups are established on the basis of similar gross incomes and proximate pre-tax ranks, and that individuals subsequently make comparisons of post-tax outcomes across these groups. Individuals would then assess their relative redistributive ill-fortune in reference groups of comparables by monitoring inter alia how they fare compared to similar others, and by assessing whether they are overtaken by or overtake these comparables in income status, thus providing a plausible "micro-foundation" for the use of HE as a normative criterion.
This suggests that comparisons with close individuals (but not necessarily exact equals) would be at least as important in terms of social and psychological reactions as comparisons with dissimilar individuals, and thus that analysis of HI and reranking in that context should be at least as important as considerations of VE. It also says that, although classical HI and reranking are both necessary and sufficient signs of HI, they are (and will be perceived as) different manifestations of violations of the HE principle.
The value of studying classical HI has nonetheless been questioned by a few authors, who reject the premise that the initial distribution is necessarily just, or who point out that utilitarianism and the Pareto principle may justify the unequal treatment of equals (as discussed above). A number of authors have also expressed dissatisfaction with the classical approach to HE because of the implementation difficulties it was seen to present. Indeed, since no two individuals are ever exactly alike in a finite sample, it was argued that analysis of equals had to proceed on the basis of groupings of unequals which were ultimately arbitrary. The proposed alternative was then to link HI and reranking and to note that the absence of reranking implies the classical requirement of HE. For instance, Feldstein (1976), p.94, argues that
the tax system should preserve the utility order, implying that if two individuals would have the same utility level in the absence of taxation, they should also have the same utility level if there is a tax.
Various other ethical justifications have also been suggested for the requirement of no reranking. For instance, King (1983) argues in favor of adding (for normative consistency) the qualification "and treating unequals accordingly" to the classical definition of HE. It then becomes clear that classical HE also implies the absence of reranking. Indeed, if two unequals are reranked by some redistribution, then it could be argued at a conceptual level that at a particular point in that process of redistribution, these two unequals became equals and were then made unequal (and reranked), thus violating classical HE. Hence, from the above, it would seem that (quoting again from King 1983, p. 102) "a necessary and sufficient condition for the existence of horizontal inequity is a change in ranking between the ex ante and the ex post distributions". We thus follow each of the approaches in turn, starting with reranking.
We first show how to decompose the net redistributive effect of taxes and transfers into vertical equity (VE) and reranking (RR) components. The VE effect measures the tendency of a tax system to "compress" the distribution of net incomes, which is linked to the progressivity of the tax system. The RR term contributes negatively to the net redistributive effect of the tax system.
The use of Lorenz and concentration curves and of the associated S-Gini indices of inequality and redistribution will enable this integration of reranking and horizontal inequity.
Recall first the definition of a concentration curve for net income in (7.6). We can show that $C_N(p)$ will never be lower than the Lorenz curve $L_N(p)$, and will be strictly greater than $L_N(p)$ for at least one value of p if there is "reranking" in the redistribution of incomes. (In a continuous distribution, a sufficient condition for reranking is that v in (7.1) is not degenerate, namely, that it is not a constant.) Intuitively, $C_N(p)$ cumulates some net incomes whose percentiles in the net income distribution exceed p. These are net incomes that exceed $Q_N(p)$. Such high incomes are nevertheless possible due to the stochastic term v in (7.1). $L_N(p)$ only cumulates the net incomes which equal $Q_N(p)$ or less. Since both curves cumulate the same proportion p of the population, $C_N(p) \ge L_N(p)$. This can also be seen by comparing the estimators in equations (7.7) and (7.9). In (7.9), the observations of $N_j$ are cumulated in increasing values of $N_j$, but in (7.7), the observations of $N_j$ are cumulated in increasing values of $X_j$, which means that some higher values of $N_j$ may be cumulated before some lower ones.
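The inequality between the two curves can be checked numerically on a small sample. The sketch below is only an illustration: the four gross and net incomes are hypothetical, and `cumulative_shares` is a helper name introduced here (it is not a DAD routine) that evaluates both curves at p = 1/n, 2/n, ..., 1.

```python
import numpy as np

def cumulative_shares(values, ranking):
    """Share of total `values` held by the bottom i/n individuals,
    i = 1..n, when individuals are ordered by `ranking`."""
    order = np.argsort(ranking, kind="stable")
    ordered = np.asarray(values, dtype=float)[order]
    return np.cumsum(ordered) / ordered.sum()

# Hypothetical data: the two richest individuals swap ranks after the net tax.
gross = np.array([10.0, 20.0, 30.0, 40.0])
net = np.array([9.0, 14.0, 27.0, 26.0])

lorenz_net = cumulative_shares(net, net)           # L_N(p): ordered by net income
concentration_net = cumulative_shares(net, gross)  # C_N(p): ordered by gross income
reranking_gap = concentration_net - lorenz_net     # C_N(p) - L_N(p) at p = 1/4, ..., 1

print(np.round(reranking_gap, 4))
```

In this example the gap is zero everywhere except at p = 3/4, the only point at which the ordering by gross income cumulates a net income (27) ahead of a lower one (26).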
It is therefore straightforward to conclude that a net tax T will cause reranking (and hence horizontal inequity) if and only if $C_N(p) > L_N(p)$ for at least one value of $p \in\, ]0, 1[$. The distance $C_N(p) - L_N(p)$ can therefore be used as an indicator of reranking² (E:18.8.5). A natural S-Gini index of reranking is then obtained as a weighted distance between the two curves:

$$RR(\rho) = \int_0^1 \left( C_N(p) - L_N(p) \right) \kappa(p; \rho)\, dp.$$

Denoting $IC_N(\rho)$ as the index of concentration of net incomes (recall (7.11)), this index of reranking can also be obtained as

$$RR(\rho) = I_N(\rho) - IC_N(\rho).$$
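As for comparisons of inequality and concentration, it is often useful to summarize the progressivity, vertical equity, horizontal inequity and redistributive effect of taxes and transfers into summary indices. We can do this by weighting the differences expressed above by the weights $\kappa(p; \rho)$ of the S-Gini indices to obtain S-Gini indices of TR progressivity ($IT(\rho)$), IR progressivity and vertical equity ($IV(\rho)$), reranking ($RR(\rho)$), and redistribution ($IR(\rho)$):

$$
\begin{aligned}
IT(\rho) &= \int_0^1 \left( L_X(p) - C_T(p) \right) \kappa(p; \rho)\, dp, \\
IV(\rho) &= \int_0^1 \left( C_N(p) - L_X(p) \right) \kappa(p; \rho)\, dp, \\
RR(\rho) &= \int_0^1 \left( C_N(p) - L_N(p) \right) \kappa(p; \rho)\, dp, \\
IR(\rho) &= \int_0^1 \left( L_N(p) - L_X(p) \right) \kappa(p; \rho)\, dp.
\end{aligned}
$$

These indices can also be computed as differences between S-Gini indices of inequality and concentration:

$$
\begin{aligned}
IT(\rho) &= IC_T(\rho) - I_X(\rho), \\
IV(\rho) &= I_X(\rho) - IC_N(\rho), \\
RR(\rho) &= I_N(\rho) - IC_N(\rho), \\
IR(\rho) &= I_X(\rho) - I_N(\rho).
\end{aligned}
$$

2 DAD: Curves|Lorenz and DAD: Curves|Concentration.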
Many of these indices were first proposed with ρ = 2, which corresponds to the case of the standard Gini index. IT(ρ = 2) is known as the Kakwani index of TR progressivity³, IV(ρ = 2) is known as the Reynolds-Smolensky index of IR progressivity and vertical equity, and RR(ρ = 2) is known as the Atkinson-Plotnick index of reranking (E:18.8.4).
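To make these summary indices concrete, the following sketch computes them at ρ = 2 for a small hypothetical sample and checks that redistribution equals vertical equity minus reranking. The data, the function names and the discrete rank weights are assumptions of this illustration rather than DAD's estimators; with these weights and ρ = 2 the inequality index reduces to the usual Gini coefficient.

```python
import numpy as np

def rank_weights(n, rho):
    """Discrete S-Gini weights for incomes sorted from poorest (i = 1) to richest (i = n)."""
    i = np.arange(1, n + 1)
    return ((n - i + 1) ** rho - (n - i) ** rho) / n ** rho

def concentration_index(values, ranking, rho=2.0):
    """S-Gini concentration index of `values` when individuals are ordered by `ranking`."""
    order = np.argsort(ranking, kind="stable")
    ordered = np.asarray(values, dtype=float)[order]
    return 1.0 - rank_weights(len(ordered), rho) @ ordered / ordered.mean()

def inequality_index(values, rho=2.0):
    """S-Gini inequality index: values are ordered by themselves."""
    return concentration_index(values, values, rho)

# Hypothetical gross incomes X and net taxes T; N = X - T reranks the two richest.
gross = np.array([10.0, 20.0, 30.0, 40.0])
tax = np.array([1.0, 6.0, 3.0, 14.0])
net = gross - tax
rho = 2.0
t = tax.mean() / gross.mean()                      # average tax rate

I_X = inequality_index(gross, rho)
I_N = inequality_index(net, rho)
IC_N = concentration_index(net, gross, rho)
IC_T = concentration_index(tax, gross, rho)

IT = IC_T - I_X    # TR progressivity (Kakwani index when rho = 2)
IV = I_X - IC_N    # IR progressivity / vertical equity (Reynolds-Smolensky)
RR = I_N - IC_N    # reranking (Atkinson-Plotnick)
IR = I_X - I_N     # net redistribution

print(f"IT={IT:.4f} IV={IV:.4f} RR={RR:.4f} IR={IR:.4f}")
```

Because the two richest individuals trade places, RR is strictly positive here and the net redistributive effect IR is smaller than the vertical equity effect IV.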
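The difference between the Lorenz curves of net and gross incomes is given by:

$$L_N(p) - L_X(p) = \underbrace{C_N(p) - L_X(p)}_{VE} - \underbrace{\left( C_N(p) - L_N(p) \right)}_{RR}. \tag{8.11}$$

The larger this difference, the more redistributive is the tax and benefit system (E:18.8.6). Alternatively, the net redistribution can be expressed in terms of S-Gini indices⁴:

$$IR(\rho) = I_X(\rho) - I_N(\rho) = \underbrace{I_X(\rho) - IC_N(\rho)}_{VE} - \underbrace{\left( I_N(\rho) - IC_N(\rho) \right)}_{RR}. \tag{8.12}$$

3 DAD: Inequality|Gini/S-Gini Index and DAD: Redistribution|Coefficient of Concentration.
4 DAD: Inequality|Gini/S-Gini Index and DAD: Redistribution|Coefficient of Concentration.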
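The first term VE in each of the above two expressions is clearly linked to the definition of IR progressivity in equation (7.26). As shown in equation (7.10), it can also be expressed in terms of TR progressivity when t ≠ 0:

$$C_N(p) - L_X(p) = \frac{t}{1-t} \left( L_X(p) - C_T(p) \right)$$

and, using S-Gini indices,

$$IV(\rho) = \frac{t}{1-t}\, IT(\rho).$$

Furthermore, if there is more than one tax and/or benefit that make up T, we can decompose total VE as a sum of the IR and TR progressivity of each tax and transfer. Say that there are J such taxes or benefits. Let $t_{(j)}$ be the (overall) average tax rate of the tax $T_{(j)}$ with j = 1, ..., J, such that $\sum_{j=1}^{J} t_{(j)} = t$, and let $C_{T(j)}(p)$ and $C_{N(j)}(p)$ be the concentration curves of taxes and net income corresponding to tax $T_{(j)}$, with $N_{(j)} = X - T_{(j)}$. Then, we have

$$C_N(p) - L_X(p) = \sum_{j=1}^{J} \frac{1 - t_{(j)}}{1 - t} \left( C_{N(j)}(p) - L_X(p) \right)$$

and

$$I_X(\rho) - IC_N(\rho) = \sum_{j=1}^{J} \frac{1 - t_{(j)}}{1 - t} \left( I_X(\rho) - IC_{N(j)}(\rho) \right).$$

$C_{N(j)}(p) - L_X(p)$ captures the vertical equity of tax or transfer j at percentile p, and $I_X(\rho) - IC_{N(j)}(\rho)$ does so in summary form; both can again easily be seen to be elements of the definition of IR progressivity. Each of these VE contributions can also be expressed as a function of TR progressivity at p (when $t_{(j)} \neq 0$):

$$C_{N(j)}(p) - L_X(p) = \frac{t_{(j)}}{1 - t_{(j)}} \left( L_X(p) - C_{T(j)}(p) \right)$$

or, using S-Gini indices of IR progressivity, as a function of S-Gini indices of TR progressivity:

$$I_X(\rho) - IC_{N(j)}(\rho) = \frac{t_{(j)}}{1 - t_{(j)}} \left( IC_{T(j)}(\rho) - I_X(\rho) \right).$$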
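The decomposition of total vertical equity across several taxes and transfers can be verified on a toy example. The sketch below assumes one hypothetical tax and one hypothetical transfer (entered as a negative tax) whose sum is the net tax T; the function names and discrete rank weights are illustrative choices, not DAD's.

```python
import numpy as np

def rank_weights(n, rho):
    """Discrete S-Gini weights for incomes sorted from poorest to richest."""
    i = np.arange(1, n + 1)
    return ((n - i + 1) ** rho - (n - i) ** rho) / n ** rho

def concentration_index(values, ranking, rho=2.0):
    """S-Gini concentration index of `values` ordered by `ranking`."""
    order = np.argsort(ranking, kind="stable")
    ordered = np.asarray(values, dtype=float)[order]
    return 1.0 - rank_weights(len(ordered), rho) @ ordered / ordered.mean()

rho = 2.0
gross = np.array([10.0, 20.0, 30.0, 40.0])
# Hypothetical components of the net tax; the transfer is a negative tax.
components = {
    "income_tax": np.array([1.0, 3.0, 6.0, 10.0]),
    "transfer": np.array([-4.0, -2.0, -1.0, 0.0]),
}
total_tax = sum(components.values())
t = total_tax.mean() / gross.mean()                # overall average net tax rate

I_X = concentration_index(gross, gross, rho)
IV_total = I_X - concentration_index(gross - total_tax, gross, rho)

contributions = {}
for name, T_j in components.items():
    t_j = T_j.mean() / gross.mean()                # average rate of component j
    IV_j = I_X - concentration_index(gross - T_j, gross, rho)   # IR progressivity of j
    IT_j = concentration_index(T_j, gross, rho) - I_X           # TR progressivity of j
    contributions[name] = {
        "weighted_IV": (1 - t_j) / (1 - t) * IV_j,
        "IV": IV_j,
        "from_IT": t_j / (1 - t_j) * IT_j,
    }

sum_of_contributions = sum(c["weighted_IV"] for c in contributions.values())
print(round(IV_total, 6), round(sum_of_contributions, 6))
```

The weighted contributions add up to total vertical equity, and each component's IR progressivity matches its TR progressivity scaled by its own average rate.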
The second term on the right-hand side of (8.11) and (8.12) is the redistribution-reducing reranking effect. As is well known from the literature on reranking (see Atkinson, 1979, and Plotnick, 1981, for instance), taking into account reranking when using rank-dependent inequality indices increases measured inequality and decreases the redistributive effect of taxation, and this explains why $I_N$ generally exceeds $IC_N$, and also why the difference can be interpreted as the impact of reranking on the net redistributive effect of taxation.
To interpret that second term, we may also think of individuals resenting being outranked by others, but enjoying outranking others, and then assess their net feeling of resentment by the amount by which the net income of the richer (than themselves) actually exceeds what the net income of the richer class would have been had no "new rich" displaced "old rich" in the distribution of net incomes. We can then show that $\mu_N \left( I_N(\rho) - IC_N(\rho) \right)$ is the expected net income resentment of the poorest person in samples of $\rho - 1$ randomly selected individuals, and thus that $RR(\rho)$ is an ethically-weighted indicator of such net resentment in the population.
We now turn to the measurement of classical horizontal equity, defined again as "the equal treatment of equals".
One natural avenue for measuring whether equals are treated equally is to estimate the variability of taxes and net incomes conditional on some initial value of gross income. We may, for instance, wish to estimate the conditional variability of T at some value of X. Alternatively, and perhaps better for expositional purposes, we may want to show that conditional variability over a range of percentiles p of gross income X, and we may thus want to estimate, for example, the conditional variance of T at the gross income $X(p)$ found at percentile p⁵ (E:18.8.7):

$$\sigma_T^2(p) = E\left[ \left( T - E[T \mid X = X(p)] \right)^2 \,\middle|\, X = X(p) \right].$$
Recent work has, however, attempted to make the measurement of classical HI flow from ethical (as opposed to descriptive or statistical) foundations. We show how this can be done using the popular Atkinson social welfare function $W(\epsilon)$ introduced in (4.37). For the distribution of net incomes, this social welfare function equals:

$$W_N(\epsilon) = \int_0^1 U(Q_N(p); \epsilon)\, dp.$$

Recall that the expected net income of those at rank p in the distribution of gross income is given by $\bar{N}(p)$. Hence, if the tax system were horizontally equitable and if all individuals at rank p in the distribution of gross income were granted $\bar{N}(p)$ in net incomes, the local level of utility would be $U(\bar{N}(p); \epsilon)$ and net-income social welfare would equal

$$W_N^E(\epsilon) = \int_0^1 U(\bar{N}(p); \epsilon)\, dp.$$

The expected net income utility of those at rank p in the distribution of gross income is, however, equal to

$$\bar{U}(p; \epsilon) = E\left[ U(N; \epsilon) \mid p \right],$$

where the expectation is taken over the net incomes N of those at rank p in the distribution of gross income. If, instead of $U(\bar{N}(p); \epsilon)$, we assigned individuals at rank p their expected net income utility $\bar{U}(p; \epsilon)$, social welfare would equal

$$W_N^P(\epsilon) = \int_0^1 \bar{U}(p; \epsilon)\, dp.$$

$W_N^E(\epsilon)$ is social welfare using ex ante expected net income; $W_N^P(\epsilon)$ is social welfare using ex ante expected net income utility. By the concavity of the utility function, we have that $\bar{U}(p; \epsilon) \le U(\bar{N}(p); \epsilon)$, and this difference captures the local utility cost of net income uncertainty at p. Hence, we also have that $W_N^P(\epsilon) \le W_N^E(\epsilon)$, a feature which we can use to capture the global social welfare cost of HI and its impact on redistribution.

5 DAD: Distribution|Conditional Standard Deviation.
To show the social welfare cost of HI and its impact on redistribution, we can follow either of two approaches. Recall that we have just provided two locally horizontally-equitable tax systems:
one in which each individual at rank p in the distribution of gross incomes receives the expected net income $\bar{N}(p)$ and utility $U(\bar{N}(p); \epsilon)$,
and one in which each of these individuals receives the expected net income utility $\bar{U}(p; \epsilon)$.
In the first case, social welfare is at least as high as under the actual distribution of net incomes, $W_N^E(\epsilon) \ge W_N(\epsilon)$, but mean income is the same under the two distributions $N(p)$ and $\bar{N}(p)$ since $\int_0^1 \bar{N}(p)\, dp = \mu_N$. Hence, a consequence of HI is to increase inequality and to decrease the redistributive fall in inequality brought about by tax and benefit systems. This is further developed in Section 8.3.2.
The second case imposes a horizontally-equitable local distribution of utility $\bar{U}(p)$ that equals the ex ante expected local utility. Compared to the actual distribution of net incomes, this reduces inequality but maintains the overall level of social welfare. Hence, it must be that average income under $\bar{U}(p)$ is lower than under $N(p)$. It also implies that the cost of inequality is lower with $\bar{U}(p)$. This is further developed in Section 8.3.3.
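Let the equally distributed equivalent (EDE) incomes for $W_N(\epsilon)$, $W_N^E(\epsilon)$ and $W_X(\epsilon)$ (social welfare under actual net incomes, under expected net incomes $\bar{N}(p)$, and under gross incomes) be $\xi_N(\epsilon)$, $\xi_N^E(\epsilon)$ and $\xi_X(\epsilon)$, respectively. As before, inequality can be measured by the differences between those ξ and the corresponding μ, as a proportion of μ, for instance $I_N(\epsilon) = 1 - \xi_N(\epsilon)/\mu_N$. Now observe that $\xi_N^E(\epsilon) \ge \xi_N(\epsilon)$ since $W_N^E(\epsilon) \ge W_N(\epsilon)$, and that the mean of the $\bar{N}(p)$ equals $\mu_N$. Hence, $I_N^E(\epsilon) \le I_N(\epsilon)$: HI increases inequality. The overall redistributive change in inequality that results from the effect of taxes and transfers can then be expressed as

$$\Delta I(\epsilon) = I_X(\epsilon) - I_N(\epsilon). \tag{8.25}$$

Note also that, by (4.35), (8.25) is equivalent to $\left( \xi_N(\epsilon) - \xi_X(\epsilon) \right)/\mu$ when the means of X and N are the same and equal to μ.
Hence, using (8.25) we obtain the following decomposition of the net redistributive change in inequality⁶:

$$\Delta I(\epsilon) = \underbrace{I_X(\epsilon) - I_N^E(\epsilon)}_{VE} - \underbrace{\left( I_N(\epsilon) - I_N^E(\epsilon) \right)}_{HI}.$$

VE represents the decrease in inequality yielded by a tax which treats equals equally. Thus, VE can be interpreted as a measure of the underlying vertical equity of horizontally-equitable net taxes. HI measures the fall in redistribution attributable to the unequal post-tax treatment of pre-tax equals. The excess of $I_N(\epsilon)$ over $I_N^E(\epsilon)$ is due to the appearance of post-tax income inequality within groups of pre-tax equals.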
In the above change-in-inequality approach, average income is kept the same while comparing distributions of actual and horizontally equitable net incomes. Social welfare and inequality do, however, vary across the distributions of actual net incomes $N(p)$ and expected net incomes $\bar{N}(p)$. In the second approach, the cost-of-inequality approach, social welfare is kept the same across the distributions being compared but the mean income required to attain this level of welfare varies. Each element of the decomposition in this section thus corresponds to a difference in means at equal social welfare.
6 DAD: Redistribution|Duclos & Lambert (1999) and DAD: Redistribution|Duclos, Jalbert & Araar (2003).
The cost of inequality in the distribution of net income can be expressed as:

$$C_N(\epsilon) = \mu_N - \xi_N(\epsilon).$$

Recall that $C_N(\epsilon)$ represents the level of per capita net income that society could use for the elimination of inequality with no loss of social welfare.
Let $C_F(\epsilon)$ represent the cost of inequality subsequent to a flat (or proportional, and thus inequality neutral) tax on gross incomes that generates the same level of social welfare as the distribution of net incomes. Denote the average income under this welfare-neutral flat tax by $\mu_F$. The net effect of redistribution on the cost of inequality then becomes:

$$\Delta C(\epsilon) = C_F(\epsilon) - C_N(\epsilon).$$

Since the flat tax yields the same social welfare, and thus the same EDE income $\xi_N(\epsilon)$, as the distribution of net incomes, and since $C_F(\epsilon) = \mu_F - \xi_N(\epsilon)$, we also have

$$\Delta C(\epsilon) = \mu_F - \mu_N,$$

which is positive if $\mu_F > \mu_N$. The more progressive the net tax system, the greater the value of $\Delta C(\epsilon)$. If the net tax system is progressive, the greater the value of ε, the greater the redistributive fall in the cost of inequality.
We then write the decomposition of the total variation in the cost of inequality as⁷:

$$\Delta C(\epsilon) = \left( C_F(\epsilon) - C_N^P(\epsilon) \right) - \left( C_N(\epsilon) - C_N^P(\epsilon) \right). \tag{8.30}$$

The redistributive fall in the cost of inequality then decomposes into two effects.
First, $C_N^P(\epsilon)$ is the cost of inequality under a (horizontally-equitable) certainty-equivalent level of net income at all ranks p. This certainty-equivalent net income is given by $\xi_N(p; \epsilon)$ at rank p, the net income which, if received with certainty, would yield the expected net income utility at that rank. Hence, for constant social welfare, a horizontally-equitable tax system corresponds to a distribution of $\xi_N(p; \epsilon)$ to each individual at pre-tax percentile p.
Second, the difference $C_F(\epsilon) - C_N^P(\epsilon)$ in (8.30) measures the difference in the cost of inequality of two horizontally equitable tax systems, the first being a flat tax system, and the second granting everyone his certainty-equivalent level of net income, with both systems yielding the same level of social welfare $W_N$. This difference is positive if the tax system is progressive in an ex ante, certainty-equivalent, sense. In such a case, the distribution across percentiles of the certainty-equivalent net incomes is less inequality costly than the distribution of gross incomes.
7 DAD: Redistribution|Duclos & Lambert (1999) and DAD: Redistribution|Duclos, Jalbert & Araar (2003).
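We may also wish to know at which percentile or for which population group HI is more pronounced, and by how much it contributes to total classical HI. For this, define the local cost of classical violations of HE at p as:

$$H(p; \epsilon) = \bar{N}(p) - \xi_N(p; \epsilon), \tag{8.31}$$

where $\bar{N}(p)$ is the expected net income and $\xi_N(p; \epsilon)$ the certainty-equivalent net income of those at rank p in the distribution of gross income. This is the "risk premium" of net income uncertainty at percentile p, and it is thus a money-metric cost of local classical HI at p. It is then possible to show that aggregating (8.31) using population weights yields the global index of total classical HI in (8.30):

$$\int_0^1 H(p; \epsilon)\, dp = C_N(\epsilon) - C_N^P(\epsilon).$$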
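The two approaches to classical HI can be illustrated with groups of exact pre-tax equals. The sketch below is a simplified illustration under stated assumptions: the three groups and their net incomes are hypothetical, the Atkinson utility is taken to be y^(1-ε)/(1-ε) (or ln y when ε = 1), and all names are introduced here for the example.

```python
import numpy as np

def utility(y, eps):
    """Atkinson-type utility of income (assumed functional form)."""
    y = np.asarray(y, dtype=float)
    return np.log(y) if eps == 1 else y ** (1 - eps) / (1 - eps)

def income_equivalent(u, eps):
    """Income that yields utility level u with certainty (inverse of `utility`)."""
    return float(np.exp(u)) if eps == 1 else float(((1 - eps) * u) ** (1 / (1 - eps)))

eps = 0.5
# Hypothetical net incomes of three groups of pre-tax equals (one group per gross-income rank).
groups = [np.array([8.0, 12.0]), np.array([18.0, 22.0]), np.array([30.0, 30.0])]
shares = np.array([len(g) for g in groups], dtype=float)
shares /= shares.sum()

expected_net = np.array([g.mean() for g in groups])                     # expected net income at p
expected_utility = np.array([utility(g, eps).mean() for g in groups])   # expected utility at p
certainty_equivalent = np.array([income_equivalent(u, eps) for u in expected_utility])
local_cost = expected_net - certainty_equivalent                        # local cost of HI at p

all_net = np.concatenate(groups)
mu_N = all_net.mean()
xi_N = income_equivalent(utility(all_net, eps).mean(), eps)             # EDE of actual net incomes

# Change-in-inequality approach: same mean, equals given their expected net income.
xi_E = income_equivalent(shares @ utility(expected_net, eps), eps)
I_N, I_E = 1 - xi_N / mu_N, 1 - xi_E / mu_N
hi_inequality = I_N - I_E

# Cost-of-inequality approach: same welfare, equals given their certainty equivalent.
mu_P = shares @ certainty_equivalent
xi_P = income_equivalent(shares @ utility(certainty_equivalent, eps), eps)
hi_cost = (mu_N - xi_N) - (mu_P - xi_P)

print(np.round(local_cost, 4), round(hi_inequality, 5), round(hi_cost, 4))
```

The third group is treated equally, so its local cost is zero; the population-weighted sum of local costs reproduces the global cost-of-inequality measure of HI.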
The literature on horizontal inequity has evolved very significantly over the last 25 years. Recent literature surveys can be found in Jenkins and Lambert (1999), Lambert and Ramos (1997a) and Lambert (2001) (see also the comment by Plotnick (1999) and the earlier reviews of Musgrave (1990) and Plotnick (1985)). See also Balcer and Sadka (1986), Feldstein (1976), Hettich (1983), Lambert and Yitzhaki (1995) and Stiglitz (1982) for a treatment of horizontal equity as a separate principle from vertical equity, and Kaplow (1989), Kaplow (1995) and Kaplow (2000) for a critique of the principle of horizontal inequity.
The early reranking approach was much influenced by Atkinson (1979), Plotnick (1981) and Plotnick (1982) (for the RR(2) index), and King (1983) (for a normative link between inequality, mobility and reranking). See also Chakravarty (1985) for normative links between inequality and reranking, Dardanoni and Lambert (2001) for a statistically-based look at the association between gross and net incomes, Duclos (1993) for the general form of the IR(ρ) indices, Jenkins (1988a) for a "within-group" horizontal equity focus, Kakwani and Lambert (1999) for a HI-related analysis of tax discrimination, Kakwani and Lambert (1998) for an axiomatic construction of equity measures, Rosen (1978) for a (rare) utility-based evaluation of horizontal inequity, and Lerman and Yitzhaki (1995) for reasons for which reranking may decrease inequality.
Classical horizontal equity has seen extensive developments particularly in the last 10 years: see, for instance, Aronson, Johnson, and Lambert (1994), Aronson and Lambert (1994), Aronson, Lambert, and Trippeer (1999) and van de Ven, Creedy, and Lambert (2001), for the use of the Gini for calculating both reranking and classical horizontal inequity; Duclos and Lambert (2000), for a cost-of-inequality approach; and Auerbach and Hassett (2002) and Lambert and Ramos (1997b), for a change-in-inequality approach.
Empirical enquiries into the extent of horizontal inequity have also been relatively numerous. They include inter alia Ankrom (1993) for comparative Swedish, British and American evidence, Berliant and Strauss (1985) for the US federal income tax system, Bishop, Formby, and Lambert (2000) for the effects of noncompliance and tax evasion, Creedy (2001) and Creedy (2002) for the impact of non-uniform indirect taxes on horizontal inequity in Australia, Creedy and van de Ven (2001) for the impact on measured horizontal inequity of using different equivalence scales and of using annual vs lifetime income, Decoster, Schokkaert, and Van Camp (1997) for indirect taxation and horizontal inequity in Belgium, Duclos (1995b) for the role of imperfections in poverty alleviation programs, Jenkins (1988b) and Nolan (1987) for the extent of reranking in the UK, Sa Aadu, Shilling, and Sirmans (1991) for whether the treatment of capital gains on owner-occupied housing matters for horizontal inequity, and Stranahan and Borg (1998) for whether an implicit "lottery tax" is a source of horizontal inequity.
The advances in the measurement of horizontal inequity have also led to a desire to decompose the overall measurement of redistribution as a function of progressivity, vertical equity, reranking and classical horizontal inequity. This is done inter alia in Duclos (1993) (with the S-Gini), Duclos (1995b) (with redistributive imperfections), Kakwani (1984) and Kakwani (1986) by using the Gini index but not attempting to measure classical horizontal inequity; and in Aronson, Johnson, and Lambert (1994), Aronson and Lambert (1994), van Doorslaer, Wagstaff, van der Burg, Christiansen, Citoni, Di Biase, Gerdtham, Gerfin, Gross, and Hakinnen (1999) (for health financing in 12 OECD countries), Wagstaff and van Doorslaer (1997) (for health financing in the Netherlands), Wagstaff, van Doorslaer, Hattem, Calonge, Christiansen, Citoni, Gerdtham, Gerfin, Gross, and Hakinnen (1999) (for personal income taxes in 12 OECD countries), all using the Gini index and incorporating both reranking and classical horizontal inequity. See also Wagstaff and van Doorslaer (2001) for a decomposition of total tax progressivity in components such as the progressivity of tax credits, marginal tax rates, allowances and deductions.
We have, up to now, focussed mostly on measuring and comparing cardinal indices of poverty and equity. As discussed in Chapter 4, this has several expositional advantages. The greatest of these advantages is probably that of focussing on only one (or a few) numerical assessments of poverty and equity. It is then relatively straightforward to compare poverty and equity across distributions just by comparing the values of these cardinal indices. The conclusions are then (seemingly) "clear-cut".
There are, however, important reasons to consider instead ordinal comparisons of poverty and equity. The most important one is that comparisons of cardinal poverty and equity indices (comparisons across time, regions, sociodemographic groups, or comparisons of policy regimes, for instance) may be disturbingly sensitive to the choice of indices and poverty lines. For instance, we might find for some poverty lines and indices that poverty is greater in a region A than in a region B, but we then find the opposite for other lines and indices. We could support the introduction of a particular fiscal policy or macroeconomic adjustment program for some social welfare indices, but could be in doubt as to whether the same support would be warranted with other indices. Since there is rarely unanimity as to the right choice of poverty lines and distributive indices, it is clear that such sensitivity can seriously undermine one's confidence in comparing distributions or in making policy recommendations.
To see this better in the context of poverty comparisons, consider the hypothetical example of Table 9.1. The second, third and fourth lines in the table show the incomes of three individuals in two hypothetical distributions, A and B. Thus, distribution A contains three incomes of 4, 11 and 20 respectively. The bottom 3 lines of the table show the value of the two most popular indices of poverty, the headcount $F(z)$ and the average poverty gap $\mu^g(z)$ indices, at two alternative poverty lines, z = 5 and z = 10. Recall from Section 5.1.2 that the poverty headcount gives the proportion of individuals in a population whose income falls underneath a poverty line. At a poverty line of 5, there is only one such person in poverty in distribution A, and the headcount is thus equal to 0.33. The average poverty gap index is the sum of the distances of the poor's incomes from the poverty line, divided by the total number of people in the population. For instance, at a poverty line of 10, there are 2 people in poverty in B, and the sum of their distances from the poverty line is (10 − 6) + (10 − 9) = 5. Divided by 3, this gives 1.66 as the average poverty gap in B for a poverty line of 10.
At a poverty line of 5, the headcount in A is clearly greater than in B, but this ranking is spectacularly reversed if we consider instead the same headcount index but at a poverty line of 10. The ranking changes again if we use the same poverty line of 10 but now focus on the average poverty gap $\mu^g(z)$. Clearly, the poverty ranking of A and B can be quite sensitive to the precise choice of measurement assumptions.

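Table 9.1

|                            | Distribution A | Distribution B |
|----------------------------|----------------|----------------|
| First individual's income  | 4              | 6              |
| Second individual's income | 11             | 9              |
| Third individual's income  | 20             | 20             |
| $F(5)$                     | 0.33           | 0              |
| $F(10)$                    | 0.33           | 0.66           |
| $\mu^g(10)$                | 2              | 1.66           |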
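The entries of Table 9.1 are easy to reproduce. The short sketch below recomputes them with two helper functions whose names are chosen here for illustration; a person is counted as poor when income is strictly below the poverty line.

```python
def headcount(incomes, z):
    """Proportion of individuals whose income falls below the poverty line z."""
    return sum(1 for y in incomes if y < z) / len(incomes)

def average_poverty_gap(incomes, z):
    """Sum of the shortfalls of the poor from z, divided by total population size."""
    return sum(max(z - y, 0) for y in incomes) / len(incomes)

A = [4, 11, 20]
B = [6, 9, 20]

for z in (5, 10):
    print(f"z={z}: F_A={headcount(A, z):.4f}, F_B={headcount(B, z):.4f}")
print(f"gap_A(10)={average_poverty_gap(A, 10):.4f}, gap_B(10)={average_poverty_gap(B, 10):.4f}")
```

The headcount ranks A as poorer at z = 5 and B as poorer at z = 10, while the average poverty gap at z = 10 ranks A as poorer again.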
The alternative to comparing the value of one or a few cardinal indices is to check whether rankings of poverty and equity are valid for a class of ethical judgments. These classes are defined over classes of indices as well as over ranges of poverty lines (for poverty comparisons). In other words, we do not wish to quantify poverty or equity. We only want to determine whether poverty and equity are higher or lower in one distribution than in another, for a class of ethical judgments. When inferred, an ordinal ranking of poverty and equity across distributions or policies establishes the sign of the differences across these distributions or policies of every one of the cardinal poverty and equity indices of that class. Note that it can say only whether poverty and equity are higher in one distribution or for one policy than for another, but not by how much. In the article in which he introduces his famous inequality index (or "concentration ratio"), Gini (1914) criticizes the curve introduced earlier by Lorenz (1905) exactly along those lines:
This graphical approach presented two drawbacks (...):
a) it does not provide a precise measurement of concentration
b) it does not allow to assess, not even in some circumstances, when or where concentration is stronger. In fact, if two curves cross each other (...), it is not always possible to say if one denotes a stronger concentration than the other. (Translated in Gini 2005, p. 24.)
Ordinal comparisons of poverty do not, therefore, provide precise numerical values to compare with numerical indicators of other aspects or effects of government policy, such as the policy's administrative or efficiency cost. This is seemingly their main defect. It is arguably also their greatest advantage. As seen above in the context of Table 9.1, differences in simple poverty indices can be deceptive when it comes to ranking distributions. They can also be deceptive in quantifying differences across distributions. To illustrate this, consider Table 9.2 with distributions A and B and a poverty line z = 1. The three FGT poverty indices agree that poverty has not increased in moving from A to B. But the quantitative change in poverty varies significantly with the value of α. With the poverty headcount, poverty remains the same, but the average poverty gap falls by 33% and the $P(z; \alpha = 2)$ index falls by 56%.
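Table 9.2

| Distributions | First (a) | Second (b) | $F(1)$    | $\mu^g(1)$  | $P(1; \alpha = 2)$ |
|---------------|-----------|------------|-----------|-------------|--------------------|
| A             | 0.25      | 2          | 0.5       | 0.375       | 0.28125            |
| B             | 0.5       | 2          | 0.5       | 0.25        | 0.125              |
| Difference (c)|           |            | no change | fall of 33% | fall of 56%        |

(a) First individual's income.
(b) Second individual's income.
(c) Changes in poverty from A to B.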
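The figures in Table 9.2 can be recovered from the FGT formula. The sketch below assumes the normalized form of the FGT index (poverty gaps divided by z, which changes nothing here since z = 1); the function names are illustrative.

```python
def fgt(incomes, z, alpha):
    """FGT index: population average of ((z - y) / z) ** alpha over the poor (y < z)."""
    return sum(((z - y) / z) ** alpha for y in incomes if y < z) / len(incomes)

def percent_change(old, new):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old

A = [0.25, 2]
B = [0.5, 2]
z = 1

changes = {}
for alpha in (0, 1, 2):
    p_a, p_b = fgt(A, z, alpha), fgt(B, z, alpha)
    changes[alpha] = percent_change(p_a, p_b)
    print(f"alpha={alpha}: A={p_a}, B={p_b}, change={changes[alpha]:.1f}%")
```

All three indices agree on the direction of the change, but the measured size of the fall grows with α.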
A focus on ordinal comparisons can save most of the considerable energy and time often spent on selecting poverty lines and poverty indices. It can avoid inter alia the difficult debate on the choice of appropriate theoretical and econometric models for estimating poverty lines. It can also escape arguments on the relative merits and properties of the many distributive indices that have been proposed in the social welfare literature, and of which the previous chapters introduced only a few. Again, this is because ordinal distributive comparisons simply order distributions, and for this, differences in numerical indices do not need to be estimated. For instance, we will see later in Section 10.1 that we can robustly order distributions A and B in Table 9.1 for all "distribution-sensitive" poverty indices and for any choice of poverty line. If such an ordering is considered sufficiently strong and informative, then, in comparing A and B, we can effectively stop quibbling on whether we should use the Watts index or the average poverty gap as a poverty index, and on whether the poverty line should be 5 or 10.
In short, ordinal poverty comparisons can sometimes be robust to the choice of measurement assumptions, since they will sometimes be valid for wide classes and ranges of such assumptions. When the problem is simply of resolving which of two policies will better alleviate poverty, or determining which of two distributions displays the greatest level of social welfare, or assessing which of two distributions is the most equal, ordinal comparisons can sometimes be sufficiently informative, and cardinal estimates will then not be needed.
As we will see in detail below, ordinal comparisons of poverty and equity involve using classes of distributive indices. It is useful to define these classes by referring to "orders of normative (or ethical) judgements", an order being denoted as s = 0, 1, 2,.... An ethical judgement of order s thus serves to define a class of indices also of order s. Whether an ordering of poverty and equity is valid for all of the indices that are members of a class of order s is empirically tested through dominance tests, which happen to be convenient variants of well-known stochastic dominance tests also of order s. When two dominance curves of a given order do not intersect, all indices that obey the ethical principles associated with this order of dominance then rank identically the two distributions. Hence, a dominance test of order s serves to test whether some distributive ranking is valid for all of the indices of a class of order s, and that class of order s can be interpreted through the use of ethical judgements of the same order s.
A first natural property of normative judgements is that a society should be judged improved whenever the income of one of its members increases and no one else's income decreases. For poverty, this would mean that indices of poverty should (weakly) fall whenever someone's income increases, everything else being the same. ("Weakly fall" means that the index should at the very least not increase following the change, and conversely for "weakly increase". This caveat applies to all of the ethical statements considered in this book.) For social welfare comparisons, this would imply that social welfare indices should increase following this improvement in someone's income. Such indices thus obey the Pareto principle: they must respond favorably to Pareto-improving changes in the distribution of income.
To see this formally, consider the case of a social welfare function, W(y), that depends on a vector y = (y1,..., yn) of n income levels.
Let $y = (y_1, \ldots, y_n)$, let $\eta > 0$ be any positive constant, and let $\dot{y} = (y_1, \ldots, y_j + \eta, \ldots, y_n)$. Then the social welfare function W obeys the Pareto principle if and only if $W(y) \le W(\dot{y})$ for all possible pairings of $y$ and $\dot{y}$.
Because the ethical condition imposed by the Pareto principle is very weak, we can consider all of the indices that obey that principle to be members of a class of ethical order 0. The poverty indices belonging to a class of order 0 would for instance all fall whenever someone's income increases, everything else being the same. Note that the case of relative poverty might seem to provide an exception to this principle, since an increase in someone's income could increase the relative poverty line and possibly also increase the poverty index. To deal with this possible exception, it is best to think of the poverty line as constant in the current discussion of ethical principles.
All of the indices which obey the Pareto ethical condition then belong to (poverty or social welfare) classes of order s = 0. It has, however, long been recognized that searches for strict Pareto improvements in distributions of incomes are generally doomed to failure, because of fundamental randomness in economic status and because of strong heterogeneity in preferences, endowments and markets. For a distributive change to be strictly Pareto improving, it must indeed not decrease anyone's income, whatever one's peculiar circumstances. This is unlikely ever to be empirically observable, even if we were to focus only on those with incomes below some poverty line. Besides, checking for Pareto improving temporal changes would require the use of panel data in order to observe individual-specific changes in incomes. Such panel data are rare, and even if we had access to them, they would still not enable us to infer Pareto improvements over an entire population (as opposed to only over an available sample). To be valid, searches for strict Pareto improvements also plausibly require no change in population size and composition, a difficulty with which we deal below through the use of the anonymity and population invariance principles.
It is thus natural and logical to consider ethical principles of order higher than that of the Pareto principle. In the light of the above, a plausible higher-order ethical judgement would require that the distributive indices be anonymous in the incomes of the individuals. That is, ceteris paribus, whether it is an individual named a rather than b that enjoys some level of income should not affect the value of a distributive index. It also follows from this property that interchanging two income levels should not affect distributive indices: these indices thus obey the symmetry or anonymity principle. Formally, we have (for a social welfare function W):
Let M be an n × n permutation matrix (a permutation matrix is composed of 0's and 1's, with each row and each column summing to 1) and let $\dot{y} = M y'$. Then the social welfare function W obeys the anonymity principle if and only if $W(y) = W(\dot{y})$ for all possible pairings of $y$ and $\dot{y}$.
Clearly, this principle would not be acceptable for an index of horizontal equity, but it would seem relatively uncontroversial for comparing inequality, social welfare or poverty across anonymous distributions.
There is another principle that we have implicitly imposed since the beginning of this book and that also goes beyond the Pareto principle. It is usually called the population invariance principle, and it simply states that adding an exact replicate of a population to that same population should not affect distributive comparisons. For a social welfare function W, we thus have:
Let $\dot{y}$ be a vector of size 2n, with $\dot{y}_j = y_j$ and with $\dot{y}_{j+n} = y_j$, j = 1, ..., n. Then the social welfare function W obeys the population invariance principle if and only if $W(y) = W(\dot{y})$ for all possible pairings of $y$ and $\dot{y}$.
As indicated on page 40, imposing this principle simplifies exposition significantly by enabling the use of quantiles and the normalization of population size to 1. The population invariance principle is thus implicitly imposed everywhere throughout the book.
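The Pareto, anonymity and population invariance principles can each be checked on a particular social welfare function. As a hedged illustration, the sketch below uses the equally distributed equivalent income of an Atkinson-type function as W; this choice, the income vector and the names are assumptions made for the example, and a numerical check on one vector does not of course prove that a principle holds for all pairings.

```python
import numpy as np

def welfare(y, eps=0.5):
    """Illustrative W: equally distributed equivalent income of an Atkinson-type function."""
    y = np.asarray(y, dtype=float)
    if eps == 1:
        return float(np.exp(np.log(y).mean()))
    return float(np.mean(y ** (1 - eps)) ** (1 / (1 - eps)))

y = np.array([4.0, 11.0, 20.0])

# Pareto principle: raise one income by eta > 0, leave the others unchanged.
eta = 1.0
y_pareto = y.copy()
y_pareto[1] += eta

# Anonymity: reorder incomes with a permutation matrix (rows and columns sum to 1).
M = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
y_permuted = M @ y

# Population invariance: append an exact replicate of the population.
y_replicated = np.concatenate([y, y])

print(welfare(y), welfare(y_pareto), welfare(y_permuted), welfare(y_replicated))
```

Welfare rises under the Pareto-improving change and is unchanged by the permutation and by the replication, as the three principles require of this W.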
Firstorder classes of distributive indices then regroup all of the indices that show a social improvement when the income at some percentile in the population increases and when no other income changes. These indices have properties that are analogous to those of Paretian indices: ceteris paribus, the larger the individual incomes, the better off is society. They are in addition symmetric in income since they obey the anonymity principle.
Even with the above anonymity constraint, it is likely that some of the firstorder distributive indices will clash in their distributive ranking. Some of the firstorder poverty indices could declare a policy reform to worsen poverty, while others might indicate that the reform improves poverty. To resolve this ambiguity, we may move to a secondorder class of distributive indices. As above, this is done by constraining distributive indices to obey additional ethical principles.
To do this, assume that distributive indices must show a social improvement whenever a meanpreserving redistributive transfer from a richer to a poorer individual occurs. This corresponds to imposing the wellknown PigouDalton principle on social judgements. To see this formally, consider again the case of a social welfare function W(y).
Let η > 0 be any positive constant, and let = (y1,... , yj + η,..., yk – η,..., yn), with yj + η ≤ yk – η. Then the social welfare function W obeys the PigouDalton principle if and only if W (y) ≤ W () for all possible pairings of y and .
The second-order classes of distributive indices thus contain those indices that have a greater ethical preference for the poorer than for the richer. They display a preference for equality of income and are therefore said to be distribution-sensitive. For instance, all other things being the same, the more equal the distribution of income among the poor, the lower the level of poverty. Ceteris paribus, if a transfer from a richer to a poorer person takes place, all second-order social welfare indices will increase and all second-order inequality and poverty indices will fall. Note again that all indices that belong to a second-order class of poverty and welfare indices also belong to the first-order class of relevant indices.
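A small numerical sketch may help fix ideas. It applies a mean-preserving transfer from a richer to a poorer individual and evaluates one particular distribution-sensitive welfare index before and after. The square-root utility, the function names and the incomes are assumptions chosen for illustration; they are not indices singled out in the text.

```python
from math import sqrt

def welfare(incomes):
    """Average of a concave utility (square root): a hypothetical second-order index."""
    return sum(sqrt(y) for y in incomes) / len(incomes)

def pigou_dalton_transfer(incomes, j, k, eta):
    """Move eta from individual k (richer) to individual j (poorer) without re-ranking them."""
    if not incomes[j] + eta <= incomes[k] - eta:
        raise ValueError("transfer would reverse the ranking of j and k")
    out = list(incomes)
    out[j] += eta
    out[k] -= eta
    return out

y = [4.0, 16.0, 25.0]                                  # hypothetical incomes
y_dot = pigou_dalton_transfer(y, j=0, k=1, eta=5.0)    # [9.0, 11.0, 25.0]

mean_preserved = sum(y) == sum(y_dot)                  # total income is unchanged
welfare_rises = welfare(y_dot) > welfare(y)            # the concave index increases
```

Any strictly concave utility would give the same qualitative answer here; a linear utility (a purely first-order index) would leave welfare unchanged.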
There are often sound ethical reasons to be socially more sensitive to what happens towards the bottom of the distribution of income than higher up in it. We may thus be less concerned about a "bad" disequalizing transfer higher up in the distribution of income than lower down. To make this more precise, imagine four levels of income, for individuals 1, 2, 3, and 4, such that $y_2 - y_1 = y_4 - y_3 > 0$ and $y_1 < y_3$. Let a marginal transfer of $1 of income be made from individual 2 to individual 1 (an equalizing transfer) at the same time as an identical marginal $1 is transferred from individual 3 to individual 4 (a disequalizing transfer). This is called in the literature a "favorable composite transfer".
Note that the equalizing transfer is made lower down in the distribution of income than the disequalizing transfer. This can be seen from the fact that the recipient of the first transfer, individual 1, has a lower income than the donor of the second transfer, individual 3, since $y_3 > y_1$. For a given distance between recipients and donors, the social improvement effect of equalizing transfers is decreasing in the income of the recipient. Said differently, Pigou-Dalton transfers lose their social improvement effects when recipients are more affluent.
Second-order indices which respond favorably to such a "favorable composite transfer" obey the transfer-sensitivity principle and therefore belong to the third-order class of indices. Again, such a favorable composite transfer is made of a beneficial Pigou-Dalton transfer within a lower part of the distribution, coupled with a reverse Pigou-Dalton transfer within an upper part of the distribution. Third-order welfare indices will increase following this change, and third-order poverty and inequality indices will fall. Formally, we have (for a social welfare function $W$):
Let $\eta > 0$ and $y_j - y_i = y_l - y_k > 2\eta$ with $y_i < y_k$.
Also let $\dot{y} = (y_1, \ldots, y_i + \eta, \ldots, y_j - \eta, \ldots, y_k - \eta, \ldots, y_l + \eta, \ldots, y_n)$. Then the social welfare function $W$ obeys the transfer-sensitivity principle if and only if $W(y) \leq W(\dot{y})$ for all possible pairings of $y$ and $\dot{y}$.
Note that, for a marginal $\eta$, the favorable composite transfer considered above involves no change in the variance of the distribution since $y_j - y_i = y_l - y_k$.
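The variance claim can be checked directly. The short derivation below uses only the definitions above; $n$ (the population size) and the mean are unchanged by the two transfers, so only the sum of squared incomes matters. It is offered as a sketch of the reasoning rather than as the original authors' argument.

```latex
\begin{aligned}
n\,\Delta\mathrm{Var}
  &= \left[(y_i+\eta)^2 - y_i^2\right] + \left[(y_j-\eta)^2 - y_j^2\right]
   + \left[(y_k-\eta)^2 - y_k^2\right] + \left[(y_l+\eta)^2 - y_l^2\right] \\
  &= 2\eta\left[(y_l - y_k) - (y_j - y_i)\right] + 4\eta^2 \\
  &= 4\eta^2 \qquad \text{when } y_j - y_i = y_l - y_k .
\end{aligned}
```

The term that is linear in $\eta$ vanishes because the two income gaps are equal, so a marginal composite transfer leaves the variance unchanged. For a finite $\eta$, a second-order term of $4\eta^2/n$ remains.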
We can, if we wish, define subsequent classes of indices in an analogous manner. To define fourth-order indices, for instance, we consider a combination of two exactly opposite and symmetric composite transfers, the first one being favorable and occurring within a lower part of the distribution, and the second one being unfavorable and occurring within a higher part of the distribution. The indices that respond favorably to this combination of composite transfers then belong to the class of fourth-order indices.
As can be seen, higher-order transfer principles essentially postulate that, as the order increases, the relative ethical weight assigned to the effect of income changes occurring at the bottom of the distribution also increases. Thus, as the order $s$ of the class of distributive indices increases, the indices become more and more sensitive to the distribution of income among the poorest. At the limit, as $s$ becomes very large, only the income of the poorest individual matters in comparing poverty and social welfare across two distributions. In that sense, the poverty and social welfare indices become more and more Rawlsian as $s$ increases.
Much of normative welfare economics has been influenced by the philosophical work of Nozick (1974), Rawls (1971) (see Rawls 1974 for a very short synthesis addressed to economists) and Sen (1982). The combined work of Kolm (1976a) and Kolm (1976b) was the first to introduce the transfer-sensitivity condition into the inequality literature, and Kakwani (1980) subsequently adapted it to poverty measurement. See also Davies and Hoy (1994) (who describe that condition as a Rawlsian extension of the Lorenz criterion), Shorrocks (1987) for a complete characterization of the transfer-sensitivity principle, and Zheng (1997) for an informative discussion of it. Higher-order principles can be interpreted using the generalized transfer principles of Fishburn and Willig (1984); see also Blackorby and Donaldson (1978) for a description of these principles as becoming "more Rawlsian". Surveys of the normative and axiomatic foundations of modern inequality measurement can be found in Blackorby, Bossert, and Donaldson (1999) and Chakravarty (1999).
Other papers which explore variations to the normative principles typically used in distributive analysis are Mosler and Muliere (1996) (for an alternative principle of transfers), Ok (1995) (for a "fuzzy" measurement of inequality), Ok (1997) (for ranking over opportunity sets), Salas (1998) (for marginal population invariance), Zoli (1999) (for a positional transfer principle when Lorenz curves intersect), and Tam and Zhang (1996) (for an alternative Pareto principle defined in terms of growth over the poor).
Experimental evidence on the normative attitudes of individuals and societies towards the measurement of poverty and equity has also grown fast in recent decades. Methods and results can be found in Amiel and Cowell (1992) (on attitudes to inequality, which question the acceptability of transfer and decomposability principles), Amiel and Cowell (1999) (on attitudes to poverty, social welfare and inequality), Amiel and Cowell (1997) (on attitudes towards poverty measurement), and in Amiel, Creedy, and Hurn (1999) (on quantifying inequality aversion using Okun's (1975) "leaky bucket experiment"). A survey of such attitudes can be found in Corneo and Gruner (2002).
Fong (2001) tests whether normative attitudes can be explained by self-interest or by values about distributive justice. Dolan and Robinson (2001) further explore whether there is a "reference point" problem in such studies, and Ravallion and Lokshin (2002) report that expectations about future levels of wellbeing can influence individuals' desire for redistributive policies.
See also Stodder (1991) for empirical evidence as to why inequality aversion can matter for ranking distributions, and Christiansen and Jansen (1978) for an example of the estimation of social preferences using the revealed structure of an existing tax system (the Norwegian one).
A number of studies have recently also attempted to distinguish between attitudes towards inequality and towards risk aversion: see inter alia Amiel, Cowell, and Polovin (2001), Beck (1994), Cowell and Schokkaert (2001), and Kroll and Davidovitz (2003).
To see how the material of Chapter 9 can be used practically to test for the robustness of poverty comparisons, we focus for simplicity on classes of additive poverty indices denoted as $\Pi^s(z^+)$, where $s$ stands again for the "ethical order" of the class and where $z^+$ will stand for the upper bound of the range of all of the poverty lines that can reasonably be envisaged. The additive poverty indices $P(z)$ that are members of that class can be expressed as
$$P(z) = \int_0^1 \pi(Q(p); z)\, dp, \tag{10.1}$$
where $z$ is a poverty line and $\pi(Q(p); z)$ is an indicator of the poverty status of someone with income $Q(p)$.
We can also think of the function $\pi(Q(p); z)$ as the contribution of an individual with income $Q(p)$ to overall poverty $P(z)$. Hence, we can also assume that $\pi(Q(p); z) = 0$ if $Q(p) > z$. This ensures that the poverty indices fulfill the well-known poverty focus principle, which simply states that changes in the incomes of the rich should not affect the poverty measure. The use of quantiles in equation (10.1) also ensures that the poverty indices $P(z)$ obey the anonymity (see page 160) and population invariance (see page 160) principles. For expositional simplicity, also assume that $\pi(Q(p); z)$ is continuously differentiable in $Q(p)$ between 0 and $z$ up to an appropriate order, and denote the $i$th-order derivative of $\pi(Q(p); z)$ with respect to $Q(p)$ as $\pi^{(i)}(Q(p); z)$.
The first class of poverty indices (denoted by $\Pi^1(z^+)$) then regroups all of the poverty indices that decrease when someone's income increases and whose poverty line does not exceed $z^+$. Formally, indices within $\Pi^1(z^+)$ are such that:
$$\Pi^1(z^+) = \left\{ P(z) \,\middle|\, z \leq z^+,\ \pi^{(1)}(Q(p); z) \leq 0 \text{ when } Q(p) \leq z \right\}, \tag{10.2}$$
where the Pareto principle (page 159) appears through the form of a nonpositive first-order derivative $\pi^{(1)}(Q(p); z)$.
The second class of poverty indices, $\Pi^2(z^+)$, contains those first-order indices that have a greater ethical preference for the poorer among the poor (recall the Pigou-Dalton principle of page 161). Increasing the income of a poorer individual is better for poverty reduction than increasing by the same amount the income of a richer person. The absolute value of the first-order derivative is therefore decreasing with $Q(p)$, and the indices are thus convex in income. This class $\Pi^2(z^+)$ is then:
$$\Pi^2(z^+) = \left\{ P(z) \,\middle|\, P(z) \in \Pi^1(z^+),\ \pi(z; z) = 0,\ \pi^{(2)}(Q(p); z) \geq 0 \text{ when } Q(p) \leq z \right\}. \tag{10.3}$$
We will discuss further below the role of the continuity condition $\pi(z; z) = 0$. Clearly, $\Pi^2(z^+) \subset \Pi^1(z^+)$, but not the reverse.
Technically, obeying the "transfer-sensitivity" principle requires for the $P(z)$ indices that their second-order derivative $\pi^{(2)}(Q(p); z)$ be decreasing in $Q(p)$. Poverty indices belonging to the third-order class of poverty indices $\Pi^3(z^+)$ are then defined as:
$$\Pi^3(z^+) = \left\{ P(z) \,\middle|\, P(z) \in \Pi^2(z^+),\ \pi^{(1)}(z; z) = 0,\ \pi^{(3)}(Q(p); z) \leq 0 \text{ when } Q(p) \leq z \right\}. \tag{10.4}$$
As before, $\Pi^3(z^+) \subset \Pi^2(z^+)$.
Subsequent classes of poverty indices are defined in an analogous manner. Generally speaking, poverty indices $P(z)$ will be members of class $\Pi^s(z^+)$ if $(-1)^s \pi^{(s)}(Q(p); z) \geq 0$ and if $\pi^{(i)}(z; z) = 0$ for $i = 0, 1, 2, \ldots, s - 2$. As the order $s$ of the class of poverty indices increases, the indices become more and more sensitive to the distribution of income among the poorest. At the limit, and as mentioned above, only the income of the poorest individual matters in comparing poverty across two distributions. Increasing the order $s$ makes us focus on smaller subsets of poverty indices, in the sense that $\Pi^s(z^+) \subset \Pi^{s-1}(z^+)$.
All poverty indices seen in Chapter 5 fit into some of the classes defined above. The poverty headcount $F(z)$ clearly belongs to $\Pi^1(z^+)$ (whenever $z \leq z^+$). As we will see, it also plays a crucial role in tests of first-order dominance. But it does not belong to the higher-order classes since it is not continuous at the poverty line. The average poverty gap belongs to $\Pi^1(z^+)$ and to $\Pi^2(z^+)$, but not to the higher-order classes. The square of the poverty gaps index belongs to $\Pi^1(z^+)$, $\Pi^2(z^+)$ and $\Pi^3(z^+)$, but not to $\Pi^4(z^+)$. More generally, the FGT indices, for which $\pi(Q(p); z) = g(p; z)^\alpha$, belong to $\Pi^s(z^+)$ when $\alpha \geq s - 1$ and $z \leq z^+$. The Watts index belongs to $\Pi^1(z^+)$ and to $\Pi^2(z^+)$, but not to $\Pi^3(z^+)$ since it does not obey the $\pi^{(1)}(z; z) = 0$ restriction. A transformation of the Watts index, by which $\pi(Q(p); z) = g(p; z)\,[\ln(z) - \ln(Q^*(p))]$, would, however, belong to $\Pi^3(z^+)$. The Chakravarty and Clark et al. indices belong to $\Pi^1(z^+)$ and $\Pi^2(z^+)$, as do the S-Gini indices of poverty.
We can now see how to determine whether poverty in A is greater than in B for all indices that are members of any one of these classes. For this, there exist two approaches: a primal and a dual one. We consider them in turn.
We are interested in whether we may assert confidently that poverty in a distribution A, as measured by $P_A(z)$, is larger than poverty in a distribution B, $P_B(z)$, for all of the poverty indices $P(z)$ belonging to one of the classes of poverty indices defined above. We are therefore interested in checking whether the following difference in poverty indices $\Delta P(z) = P_A(z) - P_B(z)$ is positive:
$$\begin{aligned} \Delta P(z) &= \int_0^1 \left[\pi(Q_A(p); z) - \pi(Q_B(p); z)\right] dp \\ &= \int_0^z \pi(y; z)\, \Delta f(y)\, dy, \end{aligned} \tag{10.5}$$
where on the second line a change of variable has been effected and where $\Delta f(y)$ is the difference in the densities of income. To demonstrate the dominance conditions, we will make repeated use of integration by parts of (10.5). This process will involve the use of stochastic dominance curves $D^s(z)$, for orders of dominance $s = 1, 2, 3, \ldots$. $D^1(z)$ is simply the cdf, $F(z)$, namely, the proportion of individuals underneath the poverty line $z$. The higher-order curves are iteratively defined as
$$D^s(z) = \int_0^z D^{s-1}(y)\, dy. \tag{10.6}$$
Thus, $D^2(z)$ is simply the area underneath the cdf curve for a range of incomes between 0 and $z$. This is illustrated in Figure 10.1. The curve shows the cdf $F(y)$ at different values of $y$. The grey-shaded area underneath that curve (up to $z$) thus gives $D^2(z)$.
Defined as in (10.6), dominance curves may seem complicated to calculate. Fortunately, there is a very useful link between the dominance curves and the popular FGT indices, a link that greatly facilitates the computation of $D^s(z)$.
We can indeed show that
$$D^s(z) = c \int_0^z (z - y)^{s-1}\, dF(y) = c \cdot P(z; \alpha = s - 1), \tag{10.7}$$
where $c = 1/(s - 1)!$ is a constant that can be basically ignored. Therefore, to compute the dominance curve of order $s$, we need only compute the FGT index at $\alpha = s - 1$, which is $P(z; \alpha = s - 1)$ (see (5.7)). Recall that $P(z; \alpha = 1)$ is the average poverty gap. Hence, the dominance curve of order 2 is simply the average poverty gap at different poverty lines. This can also be seen in Figure 10.1. The distance between $z$ and $y$ gives (when it is positive) the poverty gap at a given value of income $y$. For $y = y_1$, for instance, Figure 10.1 shows that distance as $z - y_1$. $dF(y_1)$, as measured on the vertical axis, gives the density of individuals at that level of income. The rectangular area given by the product of $(z - y_1)$ and $dF(y_1)$ then shows the contribution of those with income $y_1$ to the population average poverty gap. Integrating all such positive distances between $y$ and $z$ across the population thus amounts to calculating the average poverty gap. Again, this is the sum of individual rectangles of lengths $(z - y)$ and heights $dF(y)$, or simply the grey-shaded area of Figure 10.1.
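The link between dominance curves and FGT indices is easy to verify numerically. The sketch below computes the dominance curve of order s from the FGT index of order s - 1 and, for s = 2, compares it with a direct numerical integration of the empirical cdf. The function names, the sample incomes and the convention that the headcount counts incomes at or below the poverty line are assumptions made for this illustration.

```python
from math import factorial

def fgt(incomes, z, alpha):
    """Unnormalized FGT index: the average of max(z - y, 0)**alpha.
    For alpha = 0 it is taken to be the headcount, the share of incomes <= z."""
    n = len(incomes)
    if alpha == 0:
        return sum(1 for y in incomes if y <= z) / n
    return sum(max(z - y, 0.0) ** alpha for y in incomes) / n

def dominance_curve(incomes, z, s):
    """D^s(z) computed as c * P(z; alpha = s - 1), with c = 1/(s - 1)!."""
    return fgt(incomes, z, s - 1) / factorial(s - 1)

def area_under_cdf(incomes, z, steps=10000):
    """Midpoint-rule approximation of the integral of the cdf F(y) over [0, z]."""
    h = z / steps
    return h * sum(fgt(incomes, (k + 0.5) * h, 0) for k in range(steps))

incomes = [4.0, 11.0, 20.0]                        # hypothetical sample
d2_from_fgt = dominance_curve(incomes, 10.0, 2)    # average poverty gap at z = 10
d2_from_cdf = area_under_cdf(incomes, 10.0)        # same quantity, up to numerical error
```

With these incomes only the first individual is poor at z = 10, so both routes give an average poverty gap of (10 - 4)/3 = 2.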
Let us now integrate by parts equation (10.5). This gives:
$$\Delta P(z) = \pi(z; z)\, \Delta D^1(z) - \int_0^z \pi^{(1)}(y; z)\, \Delta D^1(y)\, dy, \tag{10.8}$$
where $\Delta D^s(y)$ is defined as $D^s_A(y) - D^s_B(y)$. If we wish to ensure that $\Delta P(z)$ is positive for all of the indices that belong to $\Pi^1(z^+)$, we need to ensure that (10.8) is positive for all of the poverty indices that satisfy the conditions in (10.2), whatever the values of their first-order derivative $\pi^{(1)}(y; z)$, so long as that derivative is everywhere nonpositive between 0 and $z^+$. For this to hold, we simply need that (recall that $D^1(y) = F(y)$)¹:
$$\Delta D^1(y) \geq 0 \text{ for all } y \in [0, z^+]. \tag{10.9}$$
We refer to this as first-order poverty dominance of B over A. The result can be summarized as follows:
$$P_A(z) - P_B(z) \geq 0 \text{ for all } P(z) \in \Pi^1(z^+) \iff D^1_A(y) \geq D^1_B(y) \text{ for all } y \in [0, z^+]. \tag{10.10}$$
The dominance condition in (10.10) is relatively stringent: it requires the headcount index in A never to be lower than the headcount in B, for all possible poverty lines between 0 and $z^+$. If, however, the condition is found to hold in practice, a very robust poverty ordering is obtained: we can then unambiguously say that poverty is higher in A than in B for all of the poverty indices in $\Pi^1(z^+)$ (including the headcount index). Since (almost) all of the poverty indices that have been proposed obey this restriction, this is a very powerful conclusion indeed. Note again that this ordering is valid for any choice of poverty line up to $z^+$.
¹ DAD: Dominance|Poverty Dominance.
Moving to second-order poverty dominance, we integrate equation (10.8) once more by parts and find that:
$$\Delta P(z) = \pi(z; z)\, \Delta D^1(z) - \pi^{(1)}(z; z)\, \Delta D^2(z) + \int_0^z \pi^{(2)}(y; z)\, \Delta D^2(y)\, dy. \tag{10.11}$$
Recall that the indices that are members of $\Pi^2(z^+)$ are such that $\pi^{(2)}(Q(p); z) \geq 0$ when $Q(p) \leq z$ and with $\pi(z; z) = 0$. Hence, if we wish $\Delta P(z)$ to be positive for all of the indices that belong to $\Pi^2(z^+)$, we must have that:
$$\Delta D^2(y) \geq 0 \text{ for all } y \in [0, z^+]. \tag{10.12}$$
This is second-order poverty dominance of B over A; it can be summarized as:
Second-order poverty dominance (primal)²:
$$P_A(z) - P_B(z) \geq 0 \text{ for all } P(z) \in \Pi^2(z^+) \iff D^2_A(y) \geq D^2_B(y) \text{ for all } y \in [0, z^+]. \tag{10.13}$$
Recall from (10.7) that $D^2(z) = P(z; \alpha = 1)$. Second-order poverty dominance thus requires the average poverty gap in A to be always larger than the average poverty gap in B, for all of the poverty lines between 0 and $z^+$. If the condition in (10.13) is found to hold in practice, then we can say that poverty is higher in A than in B for all of the poverty indices that are continuous at the poverty line and that are equality preferring (their second-order derivative is positive). That, of course, also includes the average poverty gap itself. Most of the indices found in the literature fall into that category, the major exceptions being the headcount and the Sen index. And that ordering is again valid for any choice of poverty line between 0 and $z^+$.
We can repeat this process for any arbitrarily higher order of dominance, by successive integration by parts and by determining the conditions under which all of the poverty indices $P(z)$ that are members of a class $\Pi^s(z^+)$ will indicate more poverty in A than in B, and this for all of the poverty lines $z$ between 0 and $z^+$. This gives the following general formulation of $s$th-order poverty dominance:
² DAD: Dominance|Poverty Dominance.
$s$th-order poverty dominance (primal)³:
$$P_A(z) - P_B(z) \geq 0 \text{ for all } P(z) \in \Pi^s(z^+) \iff D^s_A(y) \geq D^s_B(y) \text{ for all } y \in [0, z^+]. \tag{10.14}$$
This condition is illustrated in Figure 10.2 for general $s$-order dominance, where dominance holds until $z^+$, but would not hold if $z^+$ exceeded $z_s$. Checking poverty dominance is thus conceptually straightforward. For first-order dominance, we use what has been termed "the poverty incidence curve", which is the headcount index as a function of the range of poverty lines $[0, z^+]$. For second-order dominance, we use the "poverty deficit curve", which is the area underneath the poverty incidence curve or more simply the average poverty gap, again as a function of the range of poverty lines $[0, z^+]$. Third-order dominance makes use of the area underneath the poverty deficit curve, or the square-of-poverty-gaps index (also called the poverty severity curve) for poverty lines between 0 and $z^+$. Dominance curves for greater orders of dominance simply aggregate greater powers of poverty gaps, graphed against the same range of poverty lines $[0, z^+]$.
The condition (10.13) for second-order dominance is less stringent than (10.10) for first-order poverty dominance. To see why, consider (10.6) again: $\Delta D^2(z)$ is the integral of $\Delta D^1(y)$ over $[0, z]$, so it cannot be negative if $\Delta D^1(y)$ is nowhere negative. When first-order dominance over $[0, z^+]$ holds, then second-order dominance over $[0, z^+]$ must also hold. Hence, when we find that the poverty indices in $\Pi^1(z^+)$ show more poverty in A, we also know that the poverty indices in $\Pi^2(z^+)$ will do the same. That is of course consistent with the fact that $\Pi^2(z^+) \subset \Pi^1(z^+)$.
Suppose, however, that we have that $\Delta D^2(y) > 0$ for all $y \in [0, z^+]$, but not that $\Delta D^1(y) > 0$ for all $y \in [0, z^+]$. Hence, we have second-order, but not first-order, dominance. Poverty is larger in A for all of the indices in $\Pi^2(z^+)$ but not for all those in $\Pi^1(z^+)$. This is possible since $\Pi^1(z^+)$ is a larger set than $\Pi^2(z^+)$.
These relationships are in fact sequentially valid for higher orders as well. This is illustrated in Figures 10.3 and 10.4. Figure 10.3 shows that a class of indices $\Pi^{s+1}(z^+)$ is a subset of the lower-order class of indices $\Pi^s(z^+)$. Whenever an ordering is made over $\Pi^s(z^+)$, it is also necessarily valid over the subset $\Pi^{s+1}(z^+)$. Figure 10.4 analogously illustrates the size of the sets of distributions (A, B) that can be ordered by the dominance condition in (10.14). The greater the value of $s$, the more likely it is that a pair (A, B) falls into those sets, and therefore the more likely it is that they can be compared unambiguously by that dominance condition. Taken jointly, Figures 10.3 and 10.4 show the trade-off that exists between wishing to assert whether A really has more poverty than B (Figure 10.4), and wishing to assert this for as large a class of poverty indices and poverty lines as possible (Figure 10.3).
³ DAD: Dominance|Poverty Dominance.
For a simple illustration of these relationships, consider a comparison of distributions A and B in Table 9.1 on page 156. The first-order dominance condition (10.10) only holds if $z^+$ is lower than 9. Hence, we can conclude that A has more poverty than B for any choice of first-order indices so long as the poverty line is less than 9. Indeed, it is not hard to find some first-order indices that will show more poverty in B when $z$ exceeds 9: the headcount between 9 and 11 clearly shows more poverty in B. We can, however, verify that the second-order condition is obeyed for any choice of $z^+$. This then implies that all second-order indices (those that are members of $\Pi^2(z^+)$) will show more poverty in A, regardless of the choice of poverty line. This is quite a robust statement, since it is valid for all distribution-sensitive poverty indices (the headcount is not distribution-sensitive, hence it does not always indicate more poverty in A) and again for any choice of poverty line. Again, as mentioned above, second-order poverty dominance is a criterion that is less stringent to check in practice than first-order dominance. The price of this, however, is that the set of indices over which poverty dominance is checked is smaller for second-order dominance than for first-order dominance.
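The logic of this illustration can be reproduced with a few lines of code. Since Table 9.1 is not reproduced in this section, the two three-person distributions below are hypothetical values chosen to generate the pattern just described (first-order dominance only below 9, second-order dominance throughout). The function names and the grid-based check are likewise assumptions, and the check ignores sampling variability.

```python
from math import factorial

def dominance_curve(incomes, z, s):
    """D^s(z): the headcount for s = 1, else the mean of max(z - y, 0)**(s-1) over (s-1)!."""
    n = len(incomes)
    if s == 1:
        return sum(1 for y in incomes if y <= z) / n
    return sum(max(z - y, 0.0) ** (s - 1) for y in incomes) / (n * factorial(s - 1))

def more_poverty_in_a(a, b, z_plus, s, grid=1200, tol=1e-9):
    """Check D^s_A(y) >= D^s_B(y) at evenly spaced grid points in [0, z_plus].
    A grid check can miss crossings between points; tol absorbs rounding error."""
    for k in range(grid + 1):
        y = z_plus * k / grid
        if dominance_curve(a, y, s) < dominance_curve(b, y, s) - tol:
            return False
    return True

a = [4.0, 11.0, 20.0]   # hypothetical distribution A
b = [6.0, 9.0, 20.0]    # hypothetical distribution B

first_order_low = more_poverty_in_a(a, b, z_plus=8.5, s=1)     # True
first_order_high = more_poverty_in_a(a, b, z_plus=12.0, s=1)   # False
second_order = more_poverty_in_a(a, b, z_plus=12.0, s=2)       # True
```

With these values, the headcount at any poverty line between 9 and 11 is 1/3 in A and 2/3 in B, which is what breaks first-order dominance once the upper bound reaches 9.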
There exists a dual approach to testing first-order and second-order poverty dominance, which is sometimes called a p, percentile, or quantile approach. Whereas the primal approach makes use of curves that censor the population's income at varying poverty lines, the dual approach makes use of curves that truncate the population at varying percentile values. The dual approach has interesting graphical properties, which make it useful and informative in checking poverty dominance.
To illustrate this second approach, we focus on indices that aggregate poverty gaps using weights that are functions of $p$:
$$\Gamma(z) = \int_0^1 g(p; z)\, \omega(p)\, dp. \tag{10.15}$$
Note that using aggregates of poverty gaps as in (10.15) is more restrictive than using functions $\pi(Q(p); z)$ defined separately over $Q(p)$ and $z$, as is done in (10.1). When the poverty lines are the same across distributions (as was implicitly assumed above for the primal approach, and as is almost always assumed to be the case in practice), the dominance rankings are, however, the same for the two approaches, as we will see below.
Membership in the (dual) first-order class $\ddot{\Pi}^1(z^+)$ of poverty indices only requires that the weights $\omega(p)$ be nonnegative functions of $p$:
$$\ddot{\Pi}^1(z^+) = \left\{ \Gamma(z) \,\middle|\, z \leq z^+,\ \omega(p) \geq 0 \right\}. \tag{10.16}$$
If we want to check whether $\Delta\Gamma(z) = \Gamma_A(z) - \Gamma_B(z)$ is positive for all of the indices that belong to $\ddot{\Pi}^1(z^+)$, we need only assess whether $g_A(p; z^+) \geq g_B(p; z^+)$ for all $p \in [0, 1]$. This yields the following dual first-order poverty dominance:
First-order poverty dominance (dual)⁴:
$$\Gamma_A(z) - \Gamma_B(z) \geq 0 \text{ for all } \Gamma(z) \in \ddot{\Pi}^1(z^+) \iff g_A(p; z^+) \geq g_B(p; z^+) \text{ for all } p \in [0, 1]. \tag{10.17}$$
Condition (10.17) requires poverty gaps to be nowhere lower in A than in B, whatever the percentiles $p$ considered. It thus amounts to ordering the poverty gap curves. It is not difficult to show that this is also equivalent to checking the primal first-order poverty dominance condition in (10.10). In other words, if we can order poverty over $\Pi^1(z^+)$, then we can also do so over $\ddot{\Pi}^1(z^+)$, and vice versa. In fact, first-order poverty dominance (primal or dual) implies ordering all poverty indices (additive or otherwise) that are (weakly) decreasing in income. To check for such a wide degree of ethical robustness, we can use either the primal or the dual first-order poverty dominance condition.
The following conditions are equivalent:
1. Poverty is higher in A than in B for any of the poverty indices that obey the focus (see p. 165), the anonymity (p. 160), the population invariance (p. 160) and the Pareto (p. 159) principles and for any choice of poverty line between 0 and $z^+$;
2. $P_A(z; \alpha = 0) \geq P_B(z; \alpha = 0)$ for all $z$ between 0 and $z^+$;
3. $g_A(p; z^+) \geq g_B(p; z^+)$ for all $p$ between 0 and 1.
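Membership in the dual second-order class $\ddot{\Pi}^2(z^+)$ of poverty indices requires that the weights $\omega(p)$ be positive and decreasing functions of the ranks $p$:
$$\ddot{\Pi}^2(z^+) = \left\{ \Gamma(z) \,\middle|\, \Gamma(z) \in \ddot{\Pi}^1(z^+),\ \omega^{(1)}(p) \leq 0 \right\}. \tag{10.18}$$
⁴ DAD: Curves|Poverty Gap.
To show what dominance condition applies to (10.18), recall that $G(p; z)$ is the Cumulative Poverty Gap (CPG) curve, and integrate by parts (10.15):
$$\Gamma(z) = \omega(1)\, G(1; z) - \int_0^1 \omega^{(1)}(p)\, G(p; z)\, dp. \tag{10.19}$$
For $\Delta\Gamma(z)$ to be positive for all of the indices that belong to $\ddot{\Pi}^2(z^+)$ (and therefore also for all poverty lines $z \leq z^+$), we need to order the CPG curves. The result is summarized as:
Second-order poverty dominance (dual)⁵:
$$\Gamma_A(z) - \Gamma_B(z) \geq 0 \text{ for all } \Gamma(z) \in \ddot{\Pi}^2(z^+) \iff G_A(p; z^+) \geq G_B(p; z^+) \text{ for all } p \in [0, 1]. \tag{10.20}$$
Again, we can show that the condition in (10.20) is equivalent to the primal second-order poverty dominance condition in (10.13). In other words, $\Gamma_A(z) - \Gamma_B(z) \geq 0$ for all $\Gamma(z) \in \ddot{\Pi}^2(z^+)$ if and only if $P_A(z) - P_B(z) \geq 0$ for all $P(z) \in \Pi^2(z^+)$. Thus, to check the robustness of poverty orderings over all distribution-sensitive poverty indices, we can use either the primal or the dual second-order poverty dominance condition. This is summarized as follows: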
The following conditions are equivalent:
1. Poverty is higher in A than in B for any of the poverty indices that obey the focus (see p. 165), the anonymity (p. 160), the population invariance (p. 160), the Pareto (p. 159) and the Pigou-Dalton (p. 161) principles and for any choice of poverty line between 0 and $z^+$;
2. $P_A(z; \alpha = 1) \geq P_B(z; \alpha = 1)$ for all $z$ between 0 and $z^+$;
3. $G_A(p; z^+) \geq G_B(p; z^+)$ for all $p$ between 0 and 1.
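The dual conditions can be checked in the same spirit. The sketch below builds the poverty gap curve and the CPG curve at the ranks p = 1/n, 2/n, ..., 1 and compares them pointwise. It assumes two samples of equal size (so that the curves share the same ranks); the distributions, the function names and the tolerance are hypothetical choices for illustration.

```python
def gap_curve(incomes, z):
    """Poverty gaps g(p; z) = max(z - Q(p), 0), ordered from poorest to richest."""
    return [max(z - y, 0.0) for y in sorted(incomes)]

def cpg_curve(incomes, z):
    """Cumulative poverty gaps G(p; z) evaluated at p = 1/n, 2/n, ..., 1."""
    n = len(incomes)
    total, curve = 0.0, []
    for g in gap_curve(incomes, z):
        total += g / n
        curve.append(total)
    return curve

def pointwise_geq(curve_a, curve_b, tol=1e-9):
    """True if curve_a is nowhere below curve_b (up to rounding error)."""
    return all(x >= y - tol for x, y in zip(curve_a, curve_b))

a = [4.0, 11.0, 20.0]   # hypothetical distribution A
b = [6.0, 9.0, 20.0]    # hypothetical distribution B
z_plus = 12.0

dual_first_order = pointwise_geq(gap_curve(a, z_plus), gap_curve(b, z_plus))    # False
dual_second_order = pointwise_geq(cpg_curve(a, z_plus), cpg_curve(b, z_plus))   # True
```

For these values the gap curves cross (the second-poorest person has a gap of 1 in A and 3 in B), while the CPG curve of A is nowhere below that of B, which matches the primal checks on the headcount and on the average poverty gap at the same upper bound.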
Dual conditions for higher-order poverty dominance are not as convenient and simple as those just stated for first- and second-order dominance. It is therefore usual to check higher-order dominance using the primal conditions of (10.14). Stated in terms of ethical principles, third-order dominance reads for instance as:
The following conditions are equivalent:
1. Poverty is higher in A than in B for any of the poverty indices that obey the focus (see p. 165), the anonymity (p. 160), the population invariance (p. 160), the Pareto (p. 159), the Pigou-Dalton (p. 161) and the transfer-sensitivity (p. 161) principles and for any choice of poverty line between 0 and $z^+$;
2. $P_A(z; \alpha = 2) \geq P_B(z; \alpha = 2)$ for all $z$ between 0 and $z^+$.
⁵ DAD: Curves|CPG.
Whether we use the primal or the dual approach, testing for poverty dominance involves specifying an upper bound $z^+$ for the ordering of the dominance curves. This bound can presumably be obtained from empirical or ethical work on what reasonable range of poverty lines should be used to compare poverty. It can of course also be specified arbitrarily by the researcher. An alternative strategy is to use the available sample information and estimate directly from that information the upper bound up to which a distributive comparison can be inferred to be robust. We can then interpret this statistic as a "critical" bound. In the light of the results above, this critical bound will limit the range of poverty lines over which we will be able to order poverty across A and B.
Assume for instance that a primal poverty dominance curve $D^s_A(y)$ for A is initially higher than that for B for low values of $y$, but that this ranking is reversed for higher values of $y$. Let $\zeta^+(s)$ be the first crossing point of the curves, such that $D^s_A(y) \geq D^s_B(y)$ for all $y \leq \zeta^+(s)$. Distribution B then has less poverty than distribution A for all of the poverty indices in $\Pi^s(z^+)$, so long as $z^+ \leq \zeta^+(s)$. As the notation implies, this calculation can be done for any desired order $s$ of poverty dominance.
It may be, however, that we feel (for some order $s$) that $\zeta^+(s)$ is too low. Said differently, being able to order poverty only over a relatively narrow range may seem unsatisfactory. We may change this by moving to a higher order of dominance. Indeed, we can show that $\zeta^+(s)$ is increasing in $s$, with $\zeta^+(s+1) \geq \zeta^+(s)$, whenever $D^s_A(y) > D^s_B(y)$ for some $y \leq \zeta^+(s)$ and $D^s_A(y) \geq D^s_B(y)$ for all $y \leq \zeta^+(s)$. We may thus increase the range of poverty lines over which a poverty ranking is robust by moving up to a higher class of indices.
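A rough way to locate such a critical bound is to search a grid of poverty lines for the first point at which the dominance curve of A falls below that of B. The sketch below does this for two hypothetical three-person distributions; the function names, the grid, the search limit `z_max` and the tolerance are all assumptions, and no account is taken of sampling variability.

```python
from math import factorial

def dominance_curve(incomes, z, s):
    """D^s(z): the headcount for s = 1, else the mean of max(z - y, 0)**(s-1) over (s-1)!."""
    n = len(incomes)
    if s == 1:
        return sum(1 for y in incomes if y <= z) / n
    return sum(max(z - y, 0.0) ** (s - 1) for y in incomes) / (n * factorial(s - 1))

def critical_bound(a, b, s, z_max, grid=3000, tol=1e-9):
    """First grid point in [0, z_max] where D^s_A drops below D^s_B, or None if there is none."""
    for k in range(grid + 1):
        y = z_max * k / grid
        if dominance_curve(a, y, s) < dominance_curve(b, y, s) - tol:
            return y
    return None

a = [4.0, 11.0, 20.0]   # hypothetical distribution A
b = [6.0, 9.0, 20.0]    # hypothetical distribution B

bound_order_1 = critical_bound(a, b, s=1, z_max=30.0)   # 9.0
bound_order_2 = critical_bound(a, b, s=2, z_max=30.0)   # None: no crossing found up to z_max
```

In this example the first-order curves cross at 9, whereas no second-order crossing is found over the range searched, so moving from order 1 to order 2 widens the range of poverty lines over which A can be said to have more poverty than B.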
This is illustrated in Figure 10.5, where $z^+ < z^{++}$. For the sake of illustration, suppose that the first-order dominance curves of A and B cross somewhere between 0 and $z^{++}$. It is then impossible to order poverty over all of the indices that belong to $\Pi^1(z^{++})$. Assume, however, that decreasing the upper bound from $z^{++}$ to $z^+$ does rank the distributions over $\Pi^1(z^+)$, and that increasing the order of dominance from 1 to 2 while maintaining the upper bound at $z^{++}$ also ranks the distributions in terms of poverty. In either case, poverty is now ordered, but over different sets. The choice is then between an ordering on indices that are ethically more restrictive (such as $\Pi^2(z^{++})$), and an ordering on indices with a more restrictive range of poverty lines (such as $\Pi^1(z^+)$).
Methods for testing poverty dominance are relatively recent, and postdate much of the literature on inequality and social welfare dominance. One of the early influential papers is Atkinson (1987); that paper also introduced the idea of "restricted" dominance. The theoretical poverty dominance conditions have been further and rigorously explored in Foster and Shorrocks (1988b) and Foster and Shorrocks (1988c). Bounds to poverty dominance are discussed in Davidson and Duclos (2000). Zheng (2000a) provides a different approach based on "minimum distribution-sensitivity" poverty indices.
The Pigou-Dalton principle has been framed alternatively as a strong and as a weak axiom for the study of poverty indices (see Donaldson and Weymark (1986) and Zheng (1999a)). In the weak version, the axiom says that the poverty index must increase following a transfer from one individual to another wealthier individual, provided that both are initially below the poverty line and that the transfer does not lift the wealthier person above this threshold. The strong axiom postulates that the index must increase even if this transfer pushes the higher-income recipient above the poverty line. The strong formulation of the axiom is usually preferred.
Del Rio and Ruiz Castillo (2001), Jenkins and Lambert (1998a), Jenkins and Lambert (1998b) and Spencer and Fisher (1992) discuss the use of CPG (or "TIP") curves (initially proposed by Jenkins and Lambert 1997) for second-order poverty dominance. Surveys and integrative reviews of the literature can be found in Zheng (1999a), Zheng (2000b) and Zheng (2001a). US applications include Bishop, Formby, and Zeager (1996) (for the marginal impact of food stamps on US poverty) and Zheng, Cushing, and Chow (1995) (for another US application).
As for poverty, we may wish to determine if the ranking of two distributions of income in terms of social welfare is robust to the choice of social welfare indices. Of course, one way to check such robustness would be to verify the welfare ranking of the two distributions for a large number of the many social welfare indices that have been proposed in the literature. This, however, would certainly be a tedious task. Besides, new social welfare indices can always be designed.
A simpler and more powerful alternative is to apply tests of welfare dominance. Unlike for poverty, welfare dominance tests take into account the whole distributions of income, as opposed to just the censored distributions used for poverty comparisons.
As for poverty dominance, there are two testing approaches, a primal (incomecensoring) and a dual (percentiletruncating) one. The primal approach has the advantage of being applicable to any desired (however large) order of dominance, and uses curves of the wellknown FGT indices for an infinite range of "poverty lines" or income censoring points. The dual approach is practically convenient only for first and second order dominance, but it uses curves that are graphically instructive and that have been documented extensively in the literature. As for poverty dominance, if, for first and second order dominance, a welfare ranking is obtained using one of these two testing approaches, the same ranking will be obtained using the other approach. In other words, the two approaches are equivalent in terms of their ability to rank distributions robustly over classes of first and secondorder social welfare indices.
As for poverty dominance, for both of these approaches we will make use of classes of social welfare indices defined by the reactions of indices to changes in or reallocations of income. These social welfare indices do not need to be additive, but for expositional convenience assume that they are defined in the simple rankdependent utilitarian format of W in (4.28):
The first-order class of social welfare indices regroups all of the symmetric (or anonymous) social welfare indices that are increasing in income. In terms of (11.1), this can be formulated as the class $\Omega^1$ with
The second-order class of social welfare indices regroups all of the first-order indices that are increasing in mean-preserving equalizing transfers. Recall that such transfers redistribute one dollar of income from a richer to a poorer person. These indices thus obey the Pigou-Dalton principle of transfers. Using (11.1) again, this suggests the class of $\Omega^2$ indices:
The third-order class of social welfare indices includes all of the second-order indices that further obey the transfer-sensitivity principle, which requires that equalizing transfers have a greater impact on social welfare when they occur lower down in the distribution of income. Expressed in terms of (11.1), this requirement forces $\omega(p)$ to be a constant and requires the concavity of individual utility functions to be decreasing in income. This suggests $\Omega^3$:
As hinted above on page 162, higher orders of classes can be defined analogously. Generally speaking, membership in a higher-order class of social welfare indices requires these indices to be more sensitive to the income of the very poor. Membership in $\Omega^s$ implies membership in $\Omega^{s-1}$, and for s-order additive welfare indices, we also need that $(-1)^i U^{(i)}(Q(p)) \le 0$ for $i = 1, \ldots, s$.
As for poverty dominance, both primal and dual conditions can be used for testing first- and second-order welfare dominance. The two types of tests order social welfare on exactly the same class of indices.
First-order welfare dominance
The following conditions are equivalent:
1 Social welfare is larger in B than in A for any of the social welfare indices that obey the Pareto (p. 159), the anonymity (p. 160) and the population invariance (p. 160) principles;
2 $W_B - W_A \ge 0$ for all $W \in \Omega^1$;
3 $P_A(z; \alpha = 0) \ge P_B(z; \alpha = 0)$ for all z between 0 and infinity;
4 $D^1_A(z) \ge D^1_B(z)$ for all z between 0 and infinity;
5 $Q_A(p) \le Q_B(p)$ for all p between 0 and 1.¹
First-order welfare dominance can thus be checked by verifying whether the headcount index is higher for A than for B for all poverty lines z. There is therefore a useful analogy between tests of poverty and welfare dominance. Ordering two distributions of incomes over the first-order class of social welfare indices can also be done by comparing the incomes of the two distributions over the entire range of percentiles. Graphically, it requires checking that "Pen's parade of dwarfs and giants" be everywhere higher in B than in A, whatever the percentiles being compared. The two distributions "parade" simultaneously alongside each other, and the distributive analyst observes if one parade dominates the other everywhere.
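The dual (Pen's parade) test can be illustrated with a minimal Python sketch. This is not the DAD implementation: the helper names `quantile` and `first_order_dominates`, the percentile grid and the three small samples are assumptions made for illustration, and a finite grid can only approximate a check over all percentiles.

```python
import math

def quantile(incomes, p):
    """Empirical quantile Q(p): the smallest income y with F(y) >= p."""
    ys = sorted(incomes)
    k = max(1, math.ceil(p * len(ys) - 1e-12))  # rank reached at percentile p
    return ys[k - 1]

def first_order_dominates(b, a, grid=100):
    """True if Q_B(p) >= Q_A(p) at every percentile p on the grid (0, 1]."""
    ps = [i / grid for i in range(1, grid + 1)]
    return all(quantile(b, p) >= quantile(a, p) for p in ps)

# Hypothetical samples (not taken from the text).
A = [2, 4, 6, 8, 20]
B = [3, 4, 7, 9, 25]   # every rank is at least as rich as in A
C = [1, 5, 7, 9, 30]   # the poorest person is poorer than in A

print(first_order_dominates(B, A))  # True
print(first_order_dominates(C, A))  # False
```

In this example B dominates A at first order, whereas A and C cannot be ranked: neither parade lies everywhere above the other.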
A similar result can be stated for second-order welfare dominance. To see this, first recall the definition of the Generalized Lorenz curve GL(p) (see (4.44) on page 65):
$$GL(p) = \int_0^p Q(q)\,dq.$$
The Generalized Lorenz curve sums all incomes up to quantile Q(p), and is therefore the cumulative Pen's parade. We then obtain:
Second-order welfare dominance
The following conditions are equivalent:
1 Social welfare is larger in B than in A for any of the social welfare indices that obey the Pareto (p. 159), the anonymity (p. 160), the population invariance (p. 160) and the Pigou-Dalton (p. 161) principles;
2 $W_B - W_A \ge 0$ for all $W \in \Omega^2$;
3 $P_A(z; \alpha = 1) \ge P_B(z; \alpha = 1)$ for all z between 0 and infinity;
4 $D^2_A(z) \ge D^2_B(z)$ for all z between 0 and infinity;
5 $GL_A(p) \le GL_B(p)$ for all p between 0 and 1.²
¹ DAD: Curves|Quantile.
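The Generalized Lorenz condition can be sketched in Python as follows. The function names and data are hypothetical, and the sketch assumes two samples of equal size, so that both empirical curves are piecewise linear with the same kinks at p = i/n and comparing them at those points is enough.

```python
def generalized_lorenz(incomes):
    """GL ordinates at p = i/n, i = 0..n: cumulative income per capita."""
    ys = sorted(incomes)
    n = len(ys)
    gl, total = [0.0], 0.0
    for y in ys:
        total += y
        gl.append(total / n)
    return gl

def second_order_dominates(b, a):
    """True if GL_B(p) >= GL_A(p) at every p = i/n (equal sample sizes)."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes samples of equal size")
    pairs = zip(generalized_lorenz(b), generalized_lorenz(a))
    return all(gb >= ga for gb, ga in pairs)

# Hypothetical samples with the same mean (6).
A = [1, 5, 6, 12]
B = [2, 4, 7, 11]

print(generalized_lorenz(A))         # [0.0, 0.25, 1.5, 3.0, 6.0]
print(generalized_lorenz(B))         # [0.0, 0.5, 1.5, 3.25, 6.0]
print(second_order_dominates(B, A))  # True
```

Here B does not dominate A at first order (the second-poorest person has 4 in B but 5 in A), yet the cumulated parade of B is nowhere below that of A, so B dominates at second order.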
An exactly similar result applies for higher-order welfare dominance. As for poverty dominance, the dual conditions are less convenient and are omitted here.
Higher-order welfare dominance
The following conditions are equivalent:
1 $W_B - W_A \ge 0$ for all $W \in \Omega^s$;
2 $P_A(z; \alpha = s - 1) \ge P_B(z; \alpha = s - 1)$ for all z between 0 and infinity;
3 $D^s_A(z) \ge D^s_B(z)$ for all z between 0 and infinity.
Checking for s-order welfare dominance thus simply requires comparing the FGT indices for $\alpha = s - 1$ over all possible poverty lines.
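A primal check of this kind might be sketched as below. The helper names, the grid of poverty lines and the samples are illustrative assumptions; a finite grid only approximates "all possible poverty lines", and the sketch makes no allowance for sampling variability.

```python
def fgt(incomes, z, alpha):
    """Unnormalized FGT index: average of (z - y)^alpha over incomes y <= z."""
    return sum((z - y) ** alpha for y in incomes if y <= z) / len(incomes)

def welfare_dominates(b, a, s, z_grid):
    """True if P_A(z; s-1) >= P_B(z; s-1) for every poverty line on the grid."""
    return all(fgt(a, z, s - 1) >= fgt(b, z, s - 1) for z in z_grid)

# Hypothetical samples and grid of poverty lines (0.5, 1.0, ..., 15.0).
A = [1, 5, 6, 12]
B = [2, 4, 7, 11]
z_grid = [0.5 * k for k in range(1, 31)]

print(welfare_dominates(B, A, 1, z_grid))  # False: headcounts cross
print(welfare_dominates(B, A, 2, z_grid))  # True: average gaps never cross
```

The second-order verdict agrees with a comparison of the Generalized Lorenz curves of the same two samples, as the equivalence of the primal and dual conditions suggests.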
As for poverty and welfare dominance, we can define classes of relative inequality indices over which to check the robustness of the inequality orderings of two distributions of income. As we will see, these classes of inequality indices have properties which are analogous to those of the classes of social welfare indices. They react to income changes or income reallocations in a manner that depends on the order of the classes to which the indices belong. Unlike social welfare functions, however, relative inequality indices also need to be homogeneous of degree 0 in all income. This means that an equiproportionate change in all incomes will not affect the value of these relative inequality indices.
Consider first the class $\Upsilon^1(l^+)$ of inequality indices of the first order. Recall that income shares (or normalized quantiles) are given by $\bar{Q}(p) = Q(p)/\mu$. $\Upsilon^1(l^+)$ is a class of inequality indices that is not usually considered in the literature because it censors at $l^+$ the effects of changes affecting income shares. Indeed, besides being homogeneous of degree 0 in income, the indices that are members of $\Upsilon^1(l^+)$ are such that, for a given mean, inequality decreases when an individual's income increases, so long as that individual's income share does not exceed $l^+$. Said differently, the inequality indices in $\Upsilon^1(l^+)$ are decreasing in the income shares of those with $\bar{Q}(p) \le l^+$. If the income of an individual with income share greater than $l^+$ changes, then an index that is a member of $\Upsilon^1(l^+)$ cannot change. We can think of keeping mean income constant, following these changes, through a decrease in the income of those individuals with income shares exceeding $l^+$, since that will not by definition affect the first-order inequality indices. In addition to being symmetric in income, these indices are therefore in some loose sense of the Pareto type.
² DAD: Curves|Generalized Lorenz.
The Pareto principle underlying $\Upsilon^1(l^+)$ is thus an alternative ethical principle to the well-known Pigou-Dalton principle of transfers, which has been at the heart of inequality analysis for several decades. But the scope of this Pareto principle is censored: it only applies to income shares below $l^+$. This makes the first-order class of inequality indices a poverty-like class. For this reason, we will not have that $\Upsilon^2 \subset \Upsilon^1(l^+)$.
The Pigou-Dalton principle postulates that a mean-preserving transfer of income from a higher-income person to a lower-income person decreases inequality, whatever the income shares of those affected by this income reallocation. All of the inequality indices that belong to the class $\Upsilon^2$ of second-order inequality indices obey this principle and decrease after a mean-preserving equalizing transfer. These inequality indices are also said to be Schur-convex. Almost all of the frequently used inequality indices (including the Atkinson, S-Gini and Generalized entropy indices, with the notable exception of the variance of logarithms) are members of $\Upsilon^2$.
Those inequality indices that belong to the class $\Upsilon^3$ of third-order inequality indices also belong to $\Upsilon^2$, and weakly decrease after a favorable composite transfer. This includes the Atkinson indices and some of the Generalized entropy indices, but not the S-Gini indices. Classes $\Upsilon^s$ of higher-order inequality indices can be similarly defined. For instance, to be members of the class of fourth-order inequality indices, inequality indices must be members of $\Upsilon^3$ and must be more sensitive to favorable composite transfers when they take place lower down in the distribution of income. Again, all of the Atkinson indices belong to $\Upsilon^4$. The higher the value of s, the more Rawlsian are the indices, since the more sensitive they are to the income shares of the poorest.
Comparing the definitions of the classes $\Omega^s$ and $\Upsilon^s$, note that when the means of the distributions are equal, the social welfare ranking is the same as the inequality ranking, in the sense that if $I_A \ge I_B$ for all I in $\Upsilon^s$, then $W_A \le W_B$ for all W in $\Omega^s$, and vice versa. In such cases, checking for inequality dominance can be done by checking for welfare dominance. When the means are not equal, we can normalize all incomes by their mean (this does not affect relative inequality), and then use the welfare dominance results described in Section 11.2 for $\Omega^s$ to check for dominance over a class $\Upsilon^s$ of relative inequality indices. Hence, to check for inequality dominance, we can simply test for welfare dominance once incomes have been normalized by their mean. When B has more welfare than A at order s in terms of these normalized incomes, we can say that $I_B$ is lower than $I_A$ for all of the inequality indices that belong to $\Upsilon^s$.
As indicated above, checking for inequality dominance can be done most easily by using the welfare dominance conditions of Section 11.2 and normalizing incomes by their mean. For the primal dominance curves, we will thus need the normalized stochastic dominance curve $\bar{D}^s(l\mu)$, defined as
$\bar{D}^s(l\mu)$ is nicely linked to the normalized FGT indices $\bar{P}(l\mu; \alpha = s - 1)$:
$$\bar{D}^s(l\mu) = c\,\bar{P}(l\mu; \alpha = s - 1),$$
where c is as before a constant that we can ignore. Thus, estimating the normalized dominance curve at $l\mu$ and order s is equivalent to computing the normalized FGT index for a poverty line equal to $l\mu$ and for $\alpha$ equal to $s - 1$. Similarly, for dual dominance conditions, we may use the poverty gaps normalized by mean income:
This leads to:
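First-order restricted inequality dominance
The following conditions are equivalent:³
1 $I_A - I_B \ge 0$ for all I in $\Upsilon^1(l^+)$;
2 $\bar{P}_A(\lambda\mu_A; \alpha = 0) \ge \bar{P}_B(\lambda\mu_B; \alpha = 0)$ for all $\lambda$ between 0 and $l^+$;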
3 for all p between 0 and 1.
Note that the second condition is easily interpreted. It simply compares the proportion of those with income less than $\lambda$ times the mean in A and in B. If there are fewer such individuals in B than in A, for all $\lambda \le l^+$, inequality is greater in A for all of the first-order inequality indices censored at $l^+$.
Second-order inequality dominance
The following conditions are equivalent:⁴
1 Relative inequality is larger in A than in B for any of the inequality indices that obey the anonymity (p. 160), the population invariance (p. 160), and the Pigou-Dalton (p. 161) principles;
2 $I_A - I_B \ge 0$ for all $I \in \Upsilon^2$;
3 $\bar{P}_A(\lambda\mu_A; \alpha = 1) \ge \bar{P}_B(\lambda\mu_B; \alpha = 1)$ for all $\lambda$ between 0 and infinity;
4 $\bar{D}^2_A(\lambda\mu_A) \ge \bar{D}^2_B(\lambda\mu_B)$ for all $\lambda$ between 0 and infinity;
5 $L_A(p) \le L_B(p)$ for all p between 0 and 1.
³ DAD: Dominance|Inequality Dominance.
⁴ DAD: Dominance|Inequality Dominance.
Testing for second-order inequality dominance can thus be done simply by comparing the usual normalized average poverty gap for A and for B for all possible proportions of the mean as poverty lines. An alternative equivalent test is that of comparing the Lorenz curves for A and B. This is the well-known and classical Lorenz test, which has long been considered the golden rule of relative inequality rankings. Dual conditions for higher-order (i.e., third-order) inequality dominance have also been proposed in the literature, but they are again less convenient to use than the primal conditions.
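Both versions of the second-order test can be tried on a small example. The Python sketch below is illustrative only: the function names, the grid of proportions of the mean, the numerical tolerance and the two samples are assumptions, and equal sample sizes are assumed for the Lorenz comparison.

```python
def lorenz(incomes):
    """Lorenz ordinates L(p) at p = i/n, i = 0..n."""
    ys = sorted(incomes)
    total, cum, out = sum(ys), 0.0, [0.0]
    for y in ys:
        cum += y
        out.append(cum / total)
    return out

def lorenz_dominates(b, a):
    """True if L_B(p) >= L_A(p) at every p = i/n (equal sample sizes)."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes samples of equal size")
    return all(lb >= la for lb, la in zip(lorenz(b), lorenz(a)))

def normalized_gap(incomes, lam):
    """Normalized average poverty gap for a poverty line of lam times the mean."""
    z = lam * sum(incomes) / len(incomes)
    return sum(max(z - y, 0.0) for y in incomes) / (len(incomes) * z)

# Hypothetical samples with different means.
A = [10, 20, 30, 140]   # mean 50
B = [30, 60, 90, 220]   # mean 100

lams = [0.1 * k for k in range(1, 21)]   # proportions of the mean, 0.1 to 2.0
primal = all(normalized_gap(A, l) >= normalized_gap(B, l) - 1e-12 for l in lams)

print(lorenz_dominates(B, A), primal)  # True True
```

Both tests indicate more relative inequality in A than in B, even though the two samples have very different means.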
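A general s-order inequality dominance condition is then simply stated as:
s-order inequality dominance
The following conditions are equivalent:⁵
1 $I_A - I_B \ge 0$ for all $I \in \Upsilon^s$;
2 $\bar{P}_A(\lambda\mu_A; \alpha = s - 1) \ge \bar{P}_B(\lambda\mu_B; \alpha = s - 1)$ for all $\lambda$ between 0 and infinity;
3 $\bar{D}^s_A(\lambda\mu_A) \ge \bar{D}^s_B(\lambda\mu_B)$ for all $\lambda$ between 0 and infinity.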
We can combine some of the results derived above with what we saw in Chapter 7 on the measurement of vertical equity in order to link progressivity and inequality dominance.
First, in the absence of reranking, it is clear that a tax and/or a transfer that is TR- or IR-progressive will decrease all of the inequality indices that are members of $\Upsilon^2$. This is most easily seen by considering equations (8.11) and (8.13) and by noting that $C_N(p) = L_N(p)$ when there is no reranking. For IR progressivity, this follows from the fact that a concentration curve for net income that lies above the Lorenz curve of gross income pushes the Lorenz curve of net income upward. This decreases inequality for all second-order indices of inequality.
⁵ DAD: Dominance|Inequality Dominance.
Further, again in the absence of reranking, if a tax and/or transfer $T_1$ is more IR-progressive than a tax and/or transfer $T_2$, then $T_1$ necessarily reduces inequality by more than $T_2$ when inequality is measured by any of the inequality indices that belong to $\Upsilon^2$. This can be seen from the sum of the IR progressivity terms in (8.15) (see also equation (8.11)) and by noting again that $C_N(p) = L_N(p)$ in the absence of reranking.
We may also be concerned about the impact of a tax and benefit system on the class of first-order inequality indices, viz., on those indices that are monotonic in some lower income shares, but not always monotonic in cumulative income shares. To check whether this impact reduces first-order inequality indices, we must check whether $T(X)/X$ is always lower than $\mu_T/\mu_X$ for all of the X that are below some censoring point $l^+\mu$. This supposes again, however, that the tax does not induce reranking. When it does, one way to account for the reranking effect is to compute "income growth curves", which are given by $(N(p) - X(p))/X(p)$. (We will return to these curves in the context of the discussion of pro-poor growth in Section 11.8.) When these curves exceed the growth in average income, given by $(\mu_N - \mu_X)/\mu_X$, for all $p \le F_X(l^+\mu_X)$, then all of the first-order inequality indices in $\Upsilon^1(l^+)$ will fall.
It often occurs that two income distributions A and B are compared using estimates of average income and inequality separately. Using second-order dual conditions, it is straightforward to combine these estimates to assess whether social welfare is greater in A than in B by noting from (4.44) that $GL(p) = \mu L(p)$.
Say that we have the entire Lorenz curves of each of the two distributions. Figure 11.1 shows four cases of comparisons of average income and inequality across these two distributions. In Case 1, A Lorenz-dominates B, and it also has a higher average income. Hence, there is generalized-Lorenz dominance of A over B, and we are therefore assured that $W_A - W_B \ge 0$ for all $W \in \Omega^2$. In Case 2, A also dominates B according to the Lorenz criterion, but $\mu_A < \mu_B$; because of this, $GL_A(p)$ crosses $GL_B(p)$ and there can be no unambiguous second-order social welfare ranking. $\mu_A \ge \mu_B$ is indeed a necessary condition for welfare dominance of A over B for any order of dominance. Comparing the slopes of each of these two curves gives, however, the quantiles at various percentiles p. Since these quantiles are visibly larger in A than in B for a large lower range of p, A has less poverty than B for a large range of possible poverty lines and for many poverty indices. Case 3 depicts an ambiguous ranking of inequality across A and B. However, because $\mu_A$ is well above $\mu_B$, the Generalized Lorenz curve for A is above that for B. Finally, Case 4 shows a circumstance in which inequality and social welfare rankings clash. A has unambiguously less inequality than B according to the Lorenz criterion, but, with $\mu_A$ significantly below $\mu_B$, A has unambiguously less social welfare than B according to the generalized-Lorenz criterion and to second-order social welfare dominance.
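The logic of these cases can be reproduced numerically from published means and Lorenz ordinates alone, using $GL(p) = \mu L(p)$. The following Python sketch is illustrative: the function name `welfare_verdict`, the Lorenz ordinates (taken at p = 0.25, 0.5, 0.75, 1) and the means are hypothetical and do not correspond to the curves of Figure 11.1.

```python
def welfare_verdict(mu_a, lorenz_a, mu_b, lorenz_b):
    """Second-order welfare ranking from means and Lorenz ordinates,
    both sets of ordinates being taken at the same percentiles."""
    gl_a = [mu_a * l for l in lorenz_a]
    gl_b = [mu_b * l for l in lorenz_b]
    if all(x >= y for x, y in zip(gl_a, gl_b)):
        return "A dominates B"
    if all(y >= x for x, y in zip(gl_a, gl_b)):
        return "B dominates A"
    return "ambiguous"

# Hypothetical Lorenz ordinates at p = 0.25, 0.5, 0.75, 1: A is more equal.
L_A = [0.15, 0.35, 0.60, 1.0]
L_B = [0.05, 0.15, 0.40, 1.0]

print(welfare_verdict(120, L_A, 100, L_B))  # A dominates B (as in Case 1)
print(welfare_verdict(90, L_A, 100, L_B))   # ambiguous     (as in Case 2)
print(welfare_verdict(30, L_A, 100, L_B))   # B dominates A (as in Case 4)
```

With only a few ordinates the verdict is of course approximate; crossings between the reported percentiles would go undetected.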
The impact of government benefits and transfers on the distribution of incomes can also be visualized using curves that are linked to the poverty, social welfare and inequality dominance curves.
Say, for instance, that the expected benefit at rank p of some government program (or some economic change) is given by $\bar{B}(p)$. (This could be estimated non-parametrically.) An impact indicator of the cumulative effect of that benefit up to rank p is given by:
$$GC_B(p) = \int_0^p \bar{B}(q)\,dq,$$
with $\mu_B = GC_B(1)$. In analogy to the Generalized Lorenz curve, we may call $GC_B(p)$ a Generalized concentration curve. $GC_B(p)$ shows approximately the absolute contribution of the bottom proportion p of the population to the per capita benefits. The impact $GC_B(p)$ is only approximate since it ignores the possible reranking of individuals by the program. The concentration curve of the benefit up to rank p can then be defined as:
$$C_B(p) = \frac{GC_B(p)}{\mu_B}.$$
Recall that the concentration curve CB(p) at p gives the percentage of the total benefits that accrue to those with initial rank p or lower. Using CB(p) and GCB(p) can help assess the distributive effect of the program. For instance:
1 For understanding the approximate impact of the benefit on social welfare, we may wish to test whether $\bar{B}(p)$ is always positive, regardless of p. If so, then the benefit will tend to increase social welfare for all first-order welfare indices. If not, we can test if $GC_B(p)$ is always positive regardless of p. If so, then the approximate impact of the benefit is to increase social welfare for all second-order welfare indices.
2 For understanding the approximate impact of the benefit on poverty, we proceed basically as in point 1 just above, with the only difference that we assess the curves $\bar{B}(p)$ and $GC_B(p)$ only for all $p \in [0, F(z^+)]$. If $\bar{B}(p)$ is always positive over that range of p, then the benefit will tend to decrease poverty for all first-order poverty indices $\Pi^1(z^+)$, and if $GC_B(p)$ is always positive for all $p \in [0, F(z^+)]$, then the approximate impact of the benefit is to decrease poverty for all poverty indices in $\Pi^2(z^+)$.
3 For assessing the impact of the benefit on inequality and relative poverty, we may compare $\bar{B}(p)/\mu_B$ with $X(p)/\mu_X$, and $C_B(p)$ with $L_X(p)$. Comparing $\bar{B}(p)/\mu_B$ with $X(p)/\mu_X$ sheds light on the approximate impact of the benefit on first-order inequality indices, whereas comparing $C_B(p)$ with $L_X(p)$ shows the approximate impact of the benefit on second-order inequality indices. We compare these curves for all $p \in [0, 1]$ if we are concerned about the whole population, for all $p \in [0, F(z^+)]$ if we are only concerned about the poor, or for all $p \in [0, F(l^+\mu)]$ if we are concerned about first-order inequality indices.
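The checks listed above can be sketched with individual-level data. In the Python example below, the function name `concentration_curves` and the data are hypothetical; each observation's own benefit stands in for the expected benefit at its rank, individuals are ranked by initial income, and reranking is ignored, as in the approximation discussed in the text.

```python
def concentration_curves(incomes, benefits):
    """Return (GC_B, C_B) at p = i/n, ranking individuals by initial income."""
    n = len(incomes)
    order = sorted(range(n), key=lambda i: incomes[i])
    gc, cum = [0.0], 0.0
    for i in order:
        cum += benefits[i]
        gc.append(cum / n)
    mu_b = gc[-1]                       # per capita benefit, GC_B(1)
    return gc, [g / mu_b for g in gc]

# Hypothetical program: transfers to the bottom three, a levy on the richest.
incomes = [10, 20, 30, 40]
benefits = [4, 3, 2, -1]

gc, c = concentration_curves(incomes, benefits)

# Lorenz curve of initial incomes at the same percentiles.
total, cum, lorenz_x = sum(incomes), 0.0, [0.0]
for y in sorted(incomes):
    cum += y
    lorenz_x.append(cum / total)

print(all(b >= 0 for b in benefits))                 # False: first-order test fails
print(all(g >= 0 for g in gc))                       # True: GC_B(p) never negative
print(all(cb >= lx for cb, lx in zip(c, lorenz_x)))  # True: C_B(p) above L_X(p)
```

In this example the benefit is not everywhere positive, so nothing can be said at first order, but the Generalized concentration curve suggests a second-order welfare improvement and the concentration curve suggests a fall in second-order inequality indices.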
Assessing whether distributional changes are "pro-poor" has become increasingly widespread in academic and policy circles. We will see that it is relatively straightforward to use the tools developed above to make such an assessment. There are, however, two important issues that we must first discuss.
The first issue is whether our pro-poor standard should be absolute or relative. This is equivalent to asking whether we should be interested in the impact of growth on absolute poverty or on relative inequality. It is indeed important to distinguish between expectations that growth should change the incomes of the poor by the same absolute or by the same proportional amount; these expectations are conceptually not the same, and their empirical realization also varies significantly.
The second issue is whether pro-poor judgements should put relatively more emphasis on the impact of growth upon the poorer of the poor. This is equivalent to deciding whether our pro-poor judgements should obey higher-order ethical principles such as the Pigou-Dalton principle. We will consider two orders of pro-poor judgements: the first will obey the focus, the anonymity and the Pareto principles, and the second will also obey the Pigou-Dalton principle.
Let a distributive change entail a movement from a distribution X(p) to a distribution N(p). Let "income growth curves" be defined as the proportional change in income observed at various percentiles:⁶
$$g(p) = \frac{N(p) - X(p)}{X(p)}. \tag{11.11}$$
If the income-growth curve is positive everywhere over $p \in [0, 1]$, then it is clear from the first-order welfare dominance results of page 183 that the change increases social welfare for all of the welfare indices that belong to $\Omega^1$. It is also clear from the first-order poverty dominance results of page 174 that the change decreases poverty for all of the poverty indices that belong to $\Pi^1(\infty)$ (and thus for all those that obey the first-order ethical principles of focus, Pareto and anonymity). This result is valid for any choice of poverty lines.
A test with a greater chance to succeed is to check whether the income-growth curve is positive everywhere over $p \in [0, F_X(z^+)]$. If so, then the distributive change decreases poverty for all poverty indices P(z) that belong to $\Pi^1(z^+)$. In such circumstances, the change can be called "absolutely pro-poor", in the sense that the poor benefit in absolute terms from the distributive change. We then have:
First-order absolute pro-poor judgements
The following statements are equivalent:
1 A movement from X to N is first-order absolutely pro-poor for all choices of poverty lines between 0 and $z^+$;
2 Poverty is higher in X than in N for all of the poverty indices that obey the focus (p. 165), the population invariance (p. 160), the anonymity (p. 160) and the Pareto (p. 159) principles and for any choice of poverty line between 0 and $z^+$;
3 $P_X(z; \alpha = 0) \ge P_N(z; \alpha = 0)$ for all z between 0 and $z^+$;
4 $g(p) \ge 0$ for all p between 0 and $F_X(z^+)$.
⁶ DAD: Curves|Pro-Poor.
Income growth curves can also be used to test whether a distributive change is "relatively pro-poor", in the sense that the change increases the incomes of the poor at a faster rate than that of the incomes of the rest of the population. For that purpose, we only need to compare the income growth curve g(p) at various percentiles to the growth in mean income. If the income growth curve at all $p \in [0, F(z^+)]$ is higher than the growth in mean income, then the change can be said to be first-order relatively pro-poor. An exactly equivalent test can be done by comparing the normalized quantiles for the initial and posterior incomes (recall that normalized quantiles are just incomes as a proportion of mean income). If the normalized quantiles of the poor are increased by the change, then the change is first-order relatively pro-poor. We thus have:
First-order relative pro-poor judgements
The following statements are equivalent:
1 A movement from X to N is first-order relatively pro-poor for all choices of poverty lines between 0 and $z^+$;
2 $g(p) \ge (\mu_N - \mu_X)/\mu_X$ for all p between 0 and $F_X(z^+)$;
3 $Q_X(p)/\mu_X \le Q_N(p)/\mu_N$ for all p between 0 and $F_X(z^+)$;
4 $I_X - I_N \ge 0$ for all I in $\Pi^1(z^+/\mu_X)$;
5 $F_X(\lambda\mu_X) \ge F_N(\lambda\mu_N)$ for all $\lambda$ between 0 and $z^+/\mu_X$.
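The absolute and relative first-order tests based on income growth curves can be sketched as follows. The function names and data are hypothetical; the sketch assumes strictly positive incomes and two samples of equal size, and it compares ranks anonymously (the i-th poorest before with the i-th poorest after).

```python
def growth_curve(x, new):
    """g(p) at p = i/n: proportional change in the i-th smallest income."""
    return [(b - a) / a for a, b in zip(sorted(x), sorted(new))]

def first_order_pro_poor(x, new, z_plus):
    """Return (absolute, relative) verdicts for poverty lines in [0, z_plus]."""
    k = sum(1 for y in x if y <= z_plus)        # ranks up to F_X(z+)
    g_poor = growth_curve(x, new)[:k]
    mean_growth = sum(new) / sum(x) - 1.0       # growth in mean income
    return (all(g >= 0 for g in g_poor),
            all(g >= mean_growth for g in g_poor))

# Hypothetical initial and posterior distributions.
X = [10, 20, 40, 80, 150]
N1 = [11, 23, 44, 90, 192]   # mean income grows by 20%
N2 = [13, 25, 50, 85, 157]   # mean income grows by 10%

print(first_order_pro_poor(X, N1, z_plus=40))  # (True, False)
print(first_order_pro_poor(X, N2, z_plus=40))  # (True, True)
```

The move from X to N1 raises all of the incomes of the poor, but by less than average income, so it is absolutely but not relatively pro-poor; the move to N2 passes both tests.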
Testing for first-order pro-poor judgements can be demanding. It requires all quantiles of the poor to undergo a rate of growth that is either positive (for absolute judgements) or at least as large as the growth rate in average income (for relative judgements). We may want to relax this on the basis that a large rate of growth for the poorer among the poor may sometimes be deemed ethically sufficient to offset a low rate of growth for some percentiles of the not-so-poor. This therefore says that pro-poor judgements could give greater weight to the growth experience of the poorer among the poor. Implementing this is done by forcing pro-poor judgements to obey the Pigou-Dalton principle.
Second-order absolute pro-poor judgements
The following statements are equivalent:
1 A movement from X to N is second-order absolutely pro-poor for all choices of poverty lines between 0 and $z^+$;
2 Poverty is higher in X than in N for all of the poverty indices that obey the focus (p. 165), the anonymity (p. 160), the population invariance (p. 160), the Pareto (p. 159) and the Pigou-Dalton (p. 161) principles and for any choice of poverty line between 0 and $z^+$;
3 $P_X(z; \alpha = 1) \ge P_N(z; \alpha = 1)$ for all z between 0 and $z^+$;
4 $G_N(p; z^+) \le G_X(p; z^+)$ for all $p \in [0, 1]$.
Recall that the cumulative income up to rank p is given by the Generalized Lorenz curve. Denote its proportional change by:⁷
$$G(p) = \frac{GL_N(p) - GL_X(p)}{GL_X(p)}. \tag{11.12}$$
A sufficient condition for a second-order absolute pro-poor change is then that the growth in cumulative incomes be positive:
$$G(p) \ge 0 \text{ for all } p \in [0, F_X(z^+)].$$
As for first-order pro-poor judgements, we may wish second-order judgements to require that the incomes of the poor at least keep up with those of the rest of the population. This yields:
Second-order relative pro-poor judgements
The following statements are equivalent:
1 A movement from X to N is second-order relatively pro-poor for all choices of poverty lines between 0 and $z^+$;
2 $\bar{P}_X(\lambda\mu_X; \alpha = 1) \ge \bar{P}_N(\lambda\mu_N; \alpha = 1)$ for all $\lambda$ between 0 and $z^+/\mu_X$.
If the above conditions hold for $z^+ = \infty$, then the change also reduces all of the inequality indices that are members of the second-order class $\Upsilon^2$. From the theorem on second-order inequality dominance of page 186, this is therefore also equivalent to checking whether the Lorenz curve is pushed up by the distributive change.
A sufficient condition for second-order relative pro-poorness can also be implemented by comparing the growth in the cumulative incomes of the poor to the growth in average income. If, for all p lower than $F(z^+)$, the percentage growth in the cumulative incomes of a bottom proportion p of the population is larger than the percentage growth in mean income, then the change can be said to be second-order relatively pro-poor:
$$G(p) \ge \frac{\mu_N - \mu_X}{\mu_X} \text{ for all } p \in [0, F(z^+)],$$
where G(p) is the proportional change in the Generalized Lorenz curve at p.
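These sufficient conditions can be illustrated with a short Python sketch. The names and data are hypothetical, and the sketch assumes positive incomes and samples of equal size; the growth in cumulative incomes is evaluated at the ranks p = i/n only.

```python
def cumulative_growth_curve(x, new):
    """G(p) at p = i/n: growth in the cumulative income of the bottom i ranks."""
    out, cx, cn = [], 0.0, 0.0
    for a, b in zip(sorted(x), sorted(new)):
        cx += a
        cn += b
        out.append(cn / cx - 1.0)
    return out

def second_order_pro_poor(x, new, z_plus):
    """Sufficient (absolute, relative) checks for poverty lines in [0, z_plus]."""
    k = sum(1 for y in x if y <= z_plus)        # ranks up to F_X(z+)
    g_cum = cumulative_growth_curve(x, new)[:k]
    mean_growth = sum(new) / sum(x) - 1.0
    return (all(g >= 0 for g in g_cum),
            all(g >= mean_growth for g in g_cum))

# Hypothetical change: the second-poorest rank loses, the poorest gains a lot.
X = [10, 20, 40, 80, 150]
N = [15, 19, 45, 90, 161]    # mean income grows by 10%

g = [(b - a) / a for a, b in zip(sorted(X), sorted(N))]
print(all(v >= 0 for v in g[:3]))              # False: first-order test fails
print(second_order_pro_poor(X, N, z_plus=40))  # (True, True)
```

The fall in income at the second rank rules out a first-order verdict, but the large gain of the poorest more than offsets it once incomes are cumulated, which is the sense in which second-order judgements give greater weight to the poorer among the poor.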
Income growth curves and cumulative income growth curves may also be used to assess the impact of a distributive change on relative poverty. The procedure is similar to that of checking whether the change is pro-poor: we compare income growth for the poor to the growth of some central tendency of the income distribution. One difference with the measurement of pro-poor growth is that the central tendency of interest may be some quantile (such as median income) if the relative poverty line is set as a proportion of that quantile.
⁷ DAD: Curves|Pro-Poor.
Methods for establishing inequality dominance surprisingly predate those for establishing welfare dominance in welfare economics. The seminal works are those by Atkinson (1970), Dasgupta, Sen, and Starrett (1973) and Kolm (1969) for inequality dominance, and Shorrocks (1983) and Foster and Shorrocks (1988c) for welfare dominance. Foster and Shorrocks (1988a) explore the links between relative poverty and relative inequality dominance (see also Davidson and Duclos 2000 and Formby, Smith, and Zheng 1999). Welfare economists have made extensive use of the literature on the ranking of distributions under risk aversion; see among many others Fishburn and Vickson (1978), Pratt (1964), Whitmore (1970) and Yitzhaki (1982b).
Descriptions and theoretical foundations of dual stochastic dominance tools can be found inter alia in Pen's parade of "dwarfs and giants" (Pen 1971, Chapter 3), in Yaari (1987), in Moyes (1999) (for links with Lorenz curves), and in Davies and Hoy (1995) and Muliere and Scarsini (1989) (for when Lorenz curves intersect).
Empirical tests for inequality and welfare dominance are numerous; they include inter alia Bishop, Formby, and Smith (1991d) (Lorenz dominance in the US), Bishop, Chow, and Formby (1991b) (first-order and truncated dominance), Bishop, Formby, and Thistle (1991e) (Pen or "rank" dominance), Bishop, Formby, and Smith (1991c) (Lorenz dominance across 9 countries), Bishop, Formby, and Thistle (1992) (convergence of US regional distributions), Bishop, Formby, and Smith (1993) (welfare and inequality dominance using LIS data), Chen, Datt, and Ravallion (1994) (comparisons of 44 less developed countries), Gouveia and Tavares (1995) (Portuguese distributions), Makdissi and Groleau (2002) (Canadian distributions), Ravallion (1992) (Indonesia), Sahn and Stifel (2002) (applied to nutritional data), and Wang, Shi, and Zheng (2002) (comparing inequality and social welfare in China).
Numerous methods and indices have been proposed recently for assessing whether distributive changes are pro-poor. See, for instance, McCulloch and Baulch (1999) for the difference between a post-change poverty headcount and the headcount that would have occurred if all had gained equally; Kakwani, Khandker, and Son (2003) for a "poverty equivalent growth rate" which computes (using estimates of poverty elasticities) the growth rate that would have been needed to achieve some poverty change without a change in the distribution of relative incomes, and then compares that growth rate to the growth rate in mean income; Kakwani and Pernia (2000) for a pro-poor index given by the ratio of the actual change in poverty over the change that would have been observed under distributional neutrality, whose value is then compared to 1; Dollar and Kraay (2002) for a comparison of the growth rate in average income to the growth rate in the incomes of the lowest quintile; Ravallion and Chen (2003) for a comparison of the growth rate in average income to a "population weighted" average growth rate of the initially poor percentiles of the population (this can also be done using the area underneath the income growth curve g(p) defined in (11.11)); Klasen (2003) for a comparison of the growth rate in average income to "population" and "poverty weighted" average growth rates; Essama-Nssah (2004) for the use of an ethically-flexible weighted average of individual growth rates that does not make use of poverty lines; Datt and Ravallion (2002) for an example of the popular use of growth elasticities of poverty measures; and Son (2004) for a "poverty growth curve" that displays the growth rate in the mean income of a bottom proportion p of the population (the cumulative income growth curve G(p) of (11.12)) and compares it to the growth rate in mean income.
For policy purposes, it is often as useful to assess the impact of reforms to a benefit or public expenditure program as it is to evaluate the effect of existing programs. For administrative or political reasons, it may indeed be impossible to eliminate or to amend dramatically the structure of existing programs. Hence, comparing a current tax or benefit program with a situation in which it is supposed not to exist may not be very useful for practical purposes. Marginal reforms to such programs are nevertheless often feasible, and we therefore focus on them in this chapter. As we will see, focusing on marginal reforms also has the advantage of making it possible to measure the welfare impact of such reforms independently of the behavioral adjustment that individuals may make in reaction to these reforms.
We consider five such marginal reforms in this chapter. The first reform channels public expenditure benefits to members of specific and easily observable socio-economic groups. The main issue then is: for which socio-economic group is additional public money best spent to reduce aggregate poverty? The second type of reform consists in an increase in public expenditures that raises all incomes in some socio-economic groups by some proportional amount. Again, an important question is: for which socio-economic group would this increase in public expenditures reduce aggregate poverty the fastest? This second type of reform can also be thought of as (for instance) a process that increases the quality of infrastructure and the quantity of economic activity in a particular group or region in a way that affects proportionally all incomes and that is thus distributionally neutral in the sense of not affecting inequality within the groups affected.
The third type of reform considers a change in the price of some commodities, either through some macroeconomic or external shocks, or through a change in commodity taxes or subsidies. How is the distribution of well-being, and poverty in particular, affected by such a price change? The fourth question we ask is: what type of reform to a system of commodity taxes and subsidies could we implement, with no change in overall government revenues, but with a fall in poverty? That is, which commodities should be prime targets for a reduction in their tax rate or for an increase in their rate of subsidy, and which others should see their tax rate increase? The fifth and last type of reform affects proportionally all incomes of a certain type, such as some type of farm income or the labor income of some type of workers. Which sort of income sources should the government attempt to bolster if the primary aim is to alleviate poverty?
For all such reforms, we measure their poverty impact by the change in the FGT poverty indices that they cause. Recall that the use of the FGT indices is closely connected to checks for stochastic dominance and for the ethical robustness of poverty changes. Hence, we can use the methods below to determine how the reforms affect poverty as measured not only by the FGT poverty indices, but also by all of the poverty indices that obey some ethical conditions. For instance, if we find that some form of targeting decreases an FGT index of some $\alpha$ value for a range $[0, z^+]$ of poverty lines, then we know that the reform will also decrease all poverty indices of ethical order $\alpha + 1$, whatever the choice of a poverty line within $[0, z^+]$.
We consider first the effect of a transfer of a constant amount of income to everyone in a group k. For this, recall that the FGT index can be decomposed as:
The per capita cost to the government of granting an equal amount η(k) to each member of a group k is equal to:
Aggregate poverty after such transfers equals P(k; z; α):
To determine which group k should be of greatest priority for the targeting of government expenditures, we need to determine for which group k targeted government expenditures (in the form of η(k)) reduce aggregate poverty the most per government dollar spent. In other words, we need to compare across k the aggregate poverty reduction benefits of targeting one government dollar to a group k.
When α ≠ 0, we can show that the marginal reduction of aggregate poverty per dollar of per capita government expenditures is given by1:
and, for the normalized FGT, by:
To reduce P(z; α) the most, we must therefore target those groups for which P(k; z; α – 1) is the greatest. It is thus simply the FGT index with α – 1 that guides policy based on reducing FGT with α. The greater the value of α, the greater the chance that we will favor those groups where extreme poverty is highest.
When α = 0, the per dollar reduction of aggregate poverty is given by f(k; z), group k's density of income at the poverty line2:
We must then target those groups with the greatest density of people just around the poverty line, regardless of how much poverty there is below that poverty line—another consequence of the insensitivity of the headcount index to the distribution of incomes below z.
Table 12.1 summarizes the marginal poverty impact of targeting a constant amount to everyone in the population, or only to those in a group k. The impact is shown both for the normalized and unnormalized FGT indices. The bottom part of the table shows the poverty impact relative to the overall per capita cost of the targeting program.
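The lump-sum targeting rule can be checked numerically. The sketch below is illustrative only: the group names, incomes and poverty line are invented, and the function names are not part of DAD. It assumes the unnormalized FGT index is computed as a sample average of (z – y)^α over the poor, and it compares the per-dollar impact –αP(k; z; α – 1), obtained by differentiating that index with respect to a lump-sum transfer, against a finite-difference estimate.

```python
def fgt(incomes, z, alpha):
    """Unnormalized FGT index: average of (z - y)**alpha over everyone, with
    the non-poor contributing zero (alpha = 0 gives the headcount)."""
    return sum((z - y) ** alpha for y in incomes if y < z) / len(incomes)


def aggregate_fgt(groups, z, alpha):
    """Population-share-weighted sum of the group FGT indices."""
    n = sum(len(g) for g in groups.values())
    return sum(len(g) / n * fgt(g, z, alpha) for g in groups.values())


def per_dollar_impact_analytic(groups, k, z, alpha):
    """-alpha * P(k; z; alpha - 1); only meaningful for alpha >= 1."""
    return -alpha * fgt(groups[k], z, alpha - 1)


def per_dollar_impact_numeric(groups, k, z, alpha, eta=1e-6):
    """Finite-difference change in aggregate poverty per dollar of per capita
    spending when every member of group k receives eta."""
    n = sum(len(g) for g in groups.values())
    after = dict(groups)
    after[k] = [y + eta for y in groups[k]]
    cost_per_capita = len(groups[k]) / n * eta
    change = aggregate_fgt(after, z, alpha) - aggregate_fgt(groups, z, alpha)
    return change / cost_per_capita


# Hypothetical incomes for two groups; the poverty line is also made up.
groups = {
    "rural": [20, 35, 50, 80, 120],
    "urban": [40, 90, 110, 150, 200, 260],
}
z, alpha = 100, 2

for name in groups:
    print(name,
          round(per_dollar_impact_analytic(groups, name, z, alpha), 3),
          round(per_dollar_impact_numeric(groups, name, z, alpha), 3))
```

In this made-up sample the rural group has the larger average poverty gap, so a dollar targeted there reduces the α = 2 index by more than a dollar targeted at the urban group.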
1 DAD: Poverty|Lump-sum Targeting.
2 DAD: Poverty|Lump-sum Targeting.
Consider now a transfer that increases by a proportion λ(k) – 1 the income Q(k; p) of each member of a group k. The increase in income is thus (λ(k) – 1)Q(k; p). The FGT index for group k after such a transfer is then3
For α ≠ 0, the marginal impact of a change in λ(k) is given by
and by
for the normalized FGT index
How (12.8) (and (12.9)) varies across values of k depends on two factors. First, there is the factor [P(k; z; α) – zP(k; z; α – 1)]. Groups in which there is a significant presence of extreme poverty will tend to see their P(k; z; α) poverty indices fall significantly with α, and will thus exhibit a large value of [P(k; z; α) – zP(k; z; α – 1)]. We may thus expect that these groups should be a priority for government targeting. However, those groups with considerable incidence of extreme poverty are also those for which a proportional increase in income has the least impact on the average income of the poor, since there is then little income on which growth may have an effect. Hence, whether those groups with a higher incidence of extreme poverty will exhibit a higher value of [P(k; z; α) – zP(k; z; α – 1)] is ambiguous.
The second factor that enters into (12.8) is population share φ(k). Ceteris paribus, targeting government expenditures (in the form of an increase in λ(k)) to groups with a higher population share will naturally tend to decrease overall poverty faster. But this fails to take into account that a given increase in λ(k) will generally be more costly for the government to attain for groups with a large share of the population. Because of this, we may instead wish to compare across groups the ratio of the benefit in poverty reduction to the group per capita increase in income. Assume for simplicity that the cost of this group per capita income increase is entirely borne by the government. The per capita revenue impact of such a transfer on the government budget equals φ(k)µ(k) per unit of increase in λ(k), where µ(k) is the average income of group k.
3 DAD: Poverty|Inequality-neutral Targeting.
When α ≠ 0, the reduction of aggregate poverty per dollar spent per capita is then
and
for the normalized FGT index. To reduce P(z; α) the fastest, the government should therefore target those groups for which the term on the right is the greatest in absolute value. Compared to (12.8), (12.11) and (12.12) do not feature population shares since these shares are cancelled by the revenue impact of the government transfer. There now appears, however, the term μ(k) in the denominator. Indeed, if it must bear the entire cost of the income increase, the government will have to pay more to achieve a given increase in λ(k) for those groups with a high average income than for those groups with a lower average income level. Finally, and for the same reasons as those mentioned above, whether those groups with a higher incidence of extreme poverty will exhibit a higher value of [P(k; z; α)–zP(k; z;α– 1)] is ambiguous.
When α = 0, the per-dollar reduction of aggregate poverty following a proportional-to-income transfer is given by
Those groups with a high density of income at the poverty line, and whose average income is small, are then a prime target for a poverty-efficient proportional-to-income transfer scheme.
Table 12.2 summarizes the marginal poverty impact of inequality-neutral targeting either to everyone in the population, or to only those in a group k. The impact is shown both for the normalized and unnormalized FGT indices; as for Table 12.1, the bottom part of the table shows the poverty impact relative to the overall per capita cost of the targeting program.
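The same kind of numerical check applies to inequality-neutral targeting. The sketch below reuses invented data and hypothetical function names. It assumes that all incomes in group k are scaled by λ(k), so that the derivative of the unnormalized aggregate FGT index at λ(k) = 1 is αφ(k)[P(k; z; α) – zP(k; z; α – 1)], and that the per capita cost of a unit increase in λ(k) is φ(k)µ(k). Dividing one by the other gives the per-dollar impact, in which the population share cancels.

```python
def fgt(incomes, z, alpha):
    """Unnormalized FGT index as a sample average (non-poor contribute zero)."""
    return sum((z - y) ** alpha for y in incomes if y < z) / len(incomes)


def aggregate_fgt(groups, z, alpha):
    """Population-share-weighted sum of the group FGT indices."""
    n = sum(len(g) for g in groups.values())
    return sum(len(g) / n * fgt(g, z, alpha) for g in groups.values())


def per_dollar_analytic(groups, k, z, alpha):
    """alpha * [P(k; z; alpha) - z * P(k; z; alpha - 1)] / mu(k), alpha >= 1."""
    g = groups[k]
    mu_k = sum(g) / len(g)
    return alpha * (fgt(g, z, alpha) - z * fgt(g, z, alpha - 1)) / mu_k


def per_dollar_numeric(groups, k, z, alpha, eps=1e-6):
    """Scale group k's incomes by (1 + eps) and divide the change in
    aggregate poverty by the per capita cost phi(k) * mu(k) * eps."""
    n = sum(len(g) for g in groups.values())
    g = groups[k]
    after = dict(groups)
    after[k] = [y * (1 + eps) for y in g]
    cost = len(g) / n * (sum(g) / len(g)) * eps
    return (aggregate_fgt(after, z, alpha) - aggregate_fgt(groups, z, alpha)) / cost


# Hypothetical data, for illustration only.
groups = {
    "rural": [20, 35, 50, 80, 120],
    "urban": [40, 90, 110, 150, 200, 260],
}
z, alpha = 100, 2

for name in groups:
    print(name,
          round(per_dollar_analytic(groups, name, z, alpha), 3),
          round(per_dollar_numeric(groups, name, z, alpha), 3))
```

Here the poorer group also has the lower mean income, so both the bracketed term and the division by µ(k) work in its favor; with other data the two effects could pull in opposite directions.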
Variability of poverty line estimates across time, space, or poverty analyses and institutions can occur for several reasons. There may be methodological uncertainty and divergences as to how poverty lines should be estimated (recall Chapter 6). Estimation (sampling and non-sampling) errors also occur for purely statistical and survey reasons (see Chapters 16 and 17). Poverty lines may also be updated with time due to new data becoming available, or due to the evolution of some form of socially representative or reference income. Whatever the reason, it may be useful, given this uncertainty, to assess how responsive poverty measurement will be to such variability in poverty line estimates.
To do this, consider first the case of the unnormalized FGT indices. We find that
For the headcount index, what matters is thus the income density at the poverty line. For higher-α indices, the sensitivity to the poverty line is given simply by P(z; α – 1). The elasticity of FGT indices to the poverty line then follows as
Note that the elasticity of the headcount index has a useful graphical interpretation. Consider Figure 12.1, which shows the income density f(y) at different values of y. The area underneath the f(y) curve up to y = z gives the headcount P(α = 0; z) = F(z). The value of zf(z) is given by the size of the rectangle with width z and height f(z) in Figure 12.1. Hence, the elasticity of the headcount with respect to the poverty line is simply the ratio of the rectangular area zf(z) over the shaded area F(z). This elasticity is larger than 1 whenever the poverty line z is lower than the (first) mode of the distribution, and will in fact be above 1 in Figure 12.1 for any poverty line up to approximately z'. For poverty lines larger than z', the poverty elasticity falls below 1. Thus, it is only for societies in which the headcount is initially high that we can expect the elasticity of the headcount with respect to the poverty line to be lower than 1. Otherwise, a change of 1% in the poverty line will cause a change of more than 1% in the headcount index. For normalized FGT indices, we obtain4
and
for the corresponding elasticities. Although expressed differently, the elasticities in (12.15) and (12.17) are the same.
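The graphical argument for the headcount can be illustrated with a parametric income distribution. The sketch below assumes, purely for illustration, that incomes are lognormal with log-mean m and log-standard-deviation s (the text makes no such assumption), and computes the ratio zf(z)/F(z). The mode of this distribution is exp(m – s²), so poverty lines below it should yield an elasticity above 1.

```python
from math import exp, log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used to build the lognormal


def lognormal_cdf(y, m=0.0, s=1.0):
    """F(y) for a lognormal income distribution (an illustrative assumption)."""
    return STD_NORMAL.cdf((log(y) - m) / s)


def lognormal_pdf(y, m=0.0, s=1.0):
    """f(y) for the same lognormal distribution."""
    return STD_NORMAL.pdf((log(y) - m) / s) / (y * s)


def headcount_elasticity(z, m=0.0, s=1.0):
    """Elasticity of the headcount with respect to the poverty line:
    the rectangle z * f(z) divided by the area F(z)."""
    return z * lognormal_pdf(z, m, s) / lognormal_cdf(z, m, s)


mode = exp(0.0 - 1.0 ** 2)  # about 0.37 when m = 0 and s = 1

for z in (0.2, 0.3, mode, 1.0, 3.0):
    print(round(z, 3), round(headcount_elasticity(z), 3))
```

With these parameters the elasticity is above 1 for poverty lines below the mode and falls below 1 once the poverty line is far enough into the upper part of the distribution.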
4 DAD: Poverty|FGT Elasticity.
The level of prices is an important determinant of the distribution of incomes, and can therefore matter significantly for poverty analysis. Governments can affect price levels directly or indirectly, through the use of sales and indirect taxes, competition policy, export taxes and import duties, subsidies on food, education, energy or transportation, etc.
To see how changes in prices (and therefore how price-changing reforms) can impact poverty, let y be a household-specific level of exogenous income, and express consumers' preferences as ν. The indirect utility function is given by V(y, q; ν), where q is a vector of consumer and producer prices. We define a vector of reference prices as qR; this is necessary to assess consumers' well-being at constant prices. Denote the real income in the post-reform situation by yR, where yR is measured on the basis of the reference prices qR. yR is implicitly defined by V(yR, qR; ν) = V(y, q; ν) and explicitly by the real income function yR = R(y, q, qR; ν), where
By definition, yR gives the level of income that provides under qR the same utility as y yields under q.
We then wish to determine how real incomes are affected by a marginal change in prices. Let xc(y, q; ν) be the net consumption of good c (which can be negative if the individual or household is a net producer of good c) of a consumer/producer with income y, preferences ν and facing prices q. Let qc be the price of good c. We thus have y = Σc qc xc(y, q; ν). Differentiating (12.18), we find:
Using Roy's identity and setting reference prices to prereform prices, this leads to:
Equation (12.20) says that the observed pre-reform net consumption of good c is a sufficient statistic to know the impact on real income of a marginal change in the price of good c. This simple relationship is also valid for rationed goods. Equation (12.20) gives a "first-order approximation" to the true change in real income that occurs following a change in the price of good c. The approximation is exact when the price change is marginal. It is less exact if the price change is non-marginal and if the compensated demand for good c varies significantly with qc.
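How quickly the first-order approximation deteriorates can be seen with a specific utility function. The sketch below assumes Cobb-Douglas preferences with budget shares a_c, which is only one convenient parametric case and is not imposed by the text. For such preferences, real income at reference prices is y multiplied by the product over goods of (qR_c / q_c)^a_c, and demand is a_c y / q_c. All numbers and names are hypothetical.

```python
def real_income(y, q, q_ref, shares):
    """Equivalent income at reference prices q_ref for a Cobb-Douglas
    consumer with budget shares `shares` (an illustrative assumption)."""
    r = y
    for good, share in shares.items():
        r *= (q_ref[good] / q[good]) ** share
    return r


def demand(y, q, shares, good):
    """Cobb-Douglas demand for `good` at income y and prices q."""
    return shares[good] * y / q[good]


def compare(y, shares, good, dq):
    """Exact change in real income versus the first-order value -x_c * dq,
    taking pre-reform prices (all equal to 1) as reference prices."""
    q_ref = {g: 1.0 for g in shares}
    q_new = dict(q_ref)
    q_new[good] += dq
    exact = real_income(y, q_new, q_ref, shares) - y
    approx = -demand(y, q_ref, shares, good) * dq
    return exact, approx


shares = {"food": 0.6, "other": 0.4}   # hypothetical budget shares
small = compare(1000.0, shares, "food", 0.01)
large = compare(1000.0, shares, "food", 0.50)
print("1% price rise :", small)
print("50% price rise:", large)
```

For the small price change the two numbers nearly coincide; for the large one the first-order figure overstates the loss, because the consumer substitutes away from the good whose price has risen.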
Assume that preferences ν and exogenous income y are jointly distributed according to the distribution function F(y, ν). The conditional distribution of ν given y is denoted by F(ν|y), and the marginal distribution of income y is given by F(y). Let preferences belong to the set Θ and assume income to be distributed over [0, a]. Expected consumption of good c at income y is given by xc(y, q), such that
where Eν indicates that the expected consumption of good c is taken over all preferences in the set Θ. By (12.20), –xc(y, q) is also proportional to the expected fall in real incomes of those with income y following an increase in qc
Let xc(q) then be the per capita consumption of good c, defined as $x_c(q) = \int_0^a x_c(y, q)\, dF(y)$. By (12.20), xc(q) is also the average welfare cost of an increase in the price of good c. As a proportion of per capita consumption, consumption of good c at income y is expressed as
We can now see how the FGT indices are affected by a change in the price of good c. (For unnormalized FGT indices, we simply multiply the results by z^α.) Using (12.20), we find that:
where f(z) is again the density of income at z. When graphed over a range of poverty lines z, this effect generates the so-called "consumption dominance" CDc(z; α) curve of a good c5:
Note that the impact on poverty depends on α and z. By (12.22), CDc(z; α = 0) only takes into account the consumption pattern of those precisely at z. The impact of an increase in the price of good c on the headcount index will be large if there are many individuals bordering the poverty line (f(z) is then large) and/or if these individuals consume much of good c (xc(z, qR) is then large). The CDc(z; α = 1) curve gives the absolute contribution to total consumption of good c of those individuals with income less than z. It is therefore an informative statistic on the distribution of consumption expenditures, similar in content to the generalized concentration curve GCxc(p) for good c, which gives the absolute contribution to total xc consumption of those below a certain rank p. For α = 2, 3,..., progressively greater weight is given to the shares of those with higher poverty gaps.
5 DAD: Curves|C-Dominance and DAD: Poverty|Impact of price change.
The above section gave us the tools needed to assess the impact of marginal price changes on poverty. We may also use these tools to assess whether a revenue-neutral tax and subsidy reform could be implemented that would reduce aggregate poverty.
For this, we need to take into account the government budget constraint, and more particularly the net revenues that the government raises from a policy of commodity taxes and subsidies. Let t be the vector of tax rates on the C goods. Setting producer prices to 1 and assuming them to be constant (for simplicity) and invariant to changes in t, we then have q = 1 + t and dqc = dtc, where tc denotes the tax rate on good c. Let per capita net commodity tax revenues be denoted as R(q). They are equal to Σc tc xc(q). Without loss of generality, assume that the government's tax reform increases the tax rate on the jth commodity and uses the extra revenue raised to decrease the tax rate (or to increase the subsidy) on the lth commodity. Revenue neutrality of the tax reform requires that
Now define γ as
The numerator in (12.25) gives the marginal tax revenue of a marginal increase in the price of good l, per unit of the average welfare cost that this price increase imposes on consumers. Equivalently, this is 1 minus the deadweight loss of taxing good l, or the inverse of the marginal economic efficiency cost of funds (MECF) from taxing l (see Wildasin (1984)). The denominator gives exactly the same measures for an increase in the price of good j. γ is thus the economic (or "average") efficiency of taxing good l relative to taxing good j. We may thus interpret γ as the efficiency cost of taxing j relative to that of taxing l (the MECF for j over that for l). The higher the value of γ, the less economically efficient is taxing good j.
By simple algebraic manipulation, we can then rewrite equation (12.24) as
which fixes dqj as a revenue-neutral proportion of dql. This last relationship yields a nice synthetic expression for the impact on a FGT index P(z; α) of a revenue-neutral tax reform that increases the tax on a good j for the benefit of a fall in the tax on a good l6
We then wish to check whether such a tax reform would lead to a fall in poverty. For the fall to be "ethically robust", we would want to check that it occurs for any one of the poverty indices of some ethical order and for a range of poverty lines. To test this, it is useful to define normalized CD curves and to denote them as $\overline{CD}^c(z; \alpha)$. Normalized CD curves are just the above-defined CDc curves for good c normalized by the average consumption of that good, xc(q):
The normalized $\overline{CD}^c(z; \alpha)$ curves are thus the ethically weighted (or social) cost of taxing c as a proportion of the average welfare cost. Comparing normalized $\overline{CD}^c(z; \alpha)$ curves thus allows comparing the distributive benefits of decreasing tax rates (or increasing subsidies) across commodities, per dollar of average welfare benefit. If $\overline{CD}^l(z; \alpha) > \overline{CD}^j(z; \alpha)$, then poverty falls faster per dollar of average welfare benefit if taxes on l are decreased (instead of taxes on j)7.
For overall social efficiency, we must also take into account the parameter of economic efficiency, γ. This parameter translates tax revenue into average welfare changes. Suppose that we were to envisage a revenue-neutral tax reform that decreases tl but increases tj. It follows from (12.27) that this tax reform is poverty reducing if and only if
6 DAD: Poverty|Impact of Tax Reform.
7 DAD: Curves|C-Dominance.
Recall from (12.25) that when γ exceeds 1, the economic efficiency cost of taxing j exceeds that of taxing l. Considering economic efficiency alone then suggests increasing tl and decreasing tj.
The left-hand side of (12.29) shows the distributive benefit of the reform. It compares the fall in poverty following a decrease in tl versus that following a fall in tj, in each case per dollar of average welfare gain. Ignoring economic efficiency considerations, decreasing tl and increasing tj is then poverty reducing if that difference is positive. Condition (12.29) therefore says that decreasing tl but increasing tj reduces poverty if the distributive benefit of such a reform is larger than its economic efficiency cost.
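The trade-off between distributive benefit and efficiency cost can be sketched on household data. The code below is an illustration under explicit assumptions, not the DAD implementation. It takes the normalized CD curve of a good to be the price derivative of the normalized FGT index (using the first-order rule that a price rise lowers real income by the household's consumption of the good) divided by mean consumption of that good. It also takes the reform to be poverty reducing at a given z when the normalized curve for good l exceeds γ times the normalized curve for good j, a form that follows from combining revenue neutrality with the definition of γ. Incomes, consumption vectors and γ values are invented.

```python
def normalized_cd(incomes, consumption, z, alpha):
    """Price derivative of the normalized FGT index (alpha >= 1), divided by
    mean consumption of the good. Assumes dy_R = -x_c * dq_c per household."""
    n = len(incomes)
    mean_x = sum(consumption) / n
    total = 0.0
    for y, x in zip(incomes, consumption):
        if y < z:
            total += alpha * (1 - y / z) ** (alpha - 1) * x / z
    return total / n / mean_x


def reform_reduces_poverty(incomes, x_l, x_j, gamma, z_grid, alpha):
    """True if lowering the tax on l and raising it on j (revenue-neutral)
    lowers the FGT index of order alpha at every poverty line in z_grid."""
    return all(
        normalized_cd(incomes, x_l, z, alpha)
        - gamma * normalized_cd(incomes, x_j, z, alpha) > 0
        for z in z_grid
    )


# Hypothetical survey: good l is a staple, good j is consumed mostly by the rich.
incomes = [20, 40, 60, 80, 100, 150, 200, 300]
x_l = [10, 18, 24, 28, 30, 33, 35, 36]
x_j = [1, 3, 6, 10, 15, 30, 50, 90]
z_grid = [30, 50, 70, 90, 110]

print(reform_reduces_poverty(incomes, x_l, x_j, gamma=1.5, z_grid=z_grid, alpha=1))
print(reform_reduces_poverty(incomes, x_l, x_j, gamma=4.0, z_grid=z_grid, alpha=1))
```

With a moderate relative efficiency cost the reform passes at every poverty line in the grid; when taxing j is assumed to be four times as costly as taxing l, the distributive advantage of the staple is no longer enough at the higher poverty lines.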
We may then check whether a tax reform is "poverty efficient" and ethically robust by verifying whether the following condition holds8:
To interpret (12.30), it is useful to recall the general poverty dominance results of (10.14). Using (10.14), it follows that if condition (12.30) holds, then all of the poverty indices that are members of the class Π^(α+1)(z+) (of ethical order α + 1) will decrease following a revenue-neutral fall in tl and a rise in tj. This can be summarized as:
s-th-order poverty dominant tax reform: A revenue-neutral marginal tax reform that decreases tl and increases tj will decrease all poverty indices that are members of Π^s(z+) if and only if9
Considering the relationship between poverty and welfare dominance (see page 184), a similar result holds for welfare dominance:
s-th-order welfare dominant tax reform: A revenue-neutral marginal tax reform that decreases tl and increases tj will increase all social welfare indices that are members of Π^s if and only if
It is just a matter of notational change to use the tools developed above to assess the poverty impact of growth in some income component, in some sector of economic activity, or for some socioeconomic group. We will then be able to assess, for instance, by how much aggregate poverty would fall per percentage of growth rate in the industrialized sector (a sectoral change), or per dollar of growth in agricultural income (an income component that enters into aggregate income), or in some region.
8 DAD: Dominance|Indirect Tax Dominance.
9 DAD: Dominance|Indirect Tax Dominance.
Assume that total income X is the sum of C income components, with quantile X(p) = Σc λc X(c)(p), where λc is a factor that multiplies income component X(c) and where X(c)(p) is the expected value of income component c at rank p in the distribution of total income. Again, X(c) can be, for instance, agricultural or capital income, or the income of those living in some geographic area.
The derivative of the normalized FGT index with respect to λc is then given by10
where this CDc(z; α) curve can now be interpreted as a "component dominance" curve for income component X(c). It can be defined formally as11:
Multiplied by a proportional change dλc, CDc(z; α) gives the marginal change in the FGT indices that we can expect from growth in a component c. Note that the derivative of the unnormalized index P(z; α) is simply z^α CDc(z; α).
We can intuitively expect, however, that a given percentage change will have a larger poverty impact when it applies to a larger sector or income component. To take this element into account and to normalize by the importance of the component, we may wish instead to compute the change in the FGT indices per dollar of per capita growth in the overall economy, when that growth comes exclusively from growth in a component c. This is given by the normalized CD curves for component c12:
or by $z^{\alpha}\, \overline{CD}^c(z; \alpha)$ for the unnormalized FGT index.
Note that the richer the society, the lower will the fall in poverty tend to be per dollar of per capita growth. This is so for two reasons. First, a richer society will tend to have a lower level of poverty and fewer poor, and hence there is less scope in such an environment for poverty to decrease significantly in absolute terms. This is captured in (12.34) by a lower value of f(z) and of [z – X(p)]+. Second, in a richer society, a 1% increase in some component will generate a larger level of per capita growth in dollar terms. This is captured by a larger µX(c). Both factors will thus tend to push (12.35) downwards. Thus, growth will arithmetically tend to have a smaller absolute poverty impact in richer societies.
10 DAD: Poverty|Income-Component Proportional Growth.
11 DAD: Curves|C-Dominance.
12 DAD: Curves|C-Dominance.
An alternative indicator of the poverty impact of growth is the elasticity of poverty with respect to overall growth, where again that overall growth comes strictly from growth in a component X(c). From (12.35), this is given by
for both normalized and unnormalized FGT indices. Expressed as elasticities, the impact of income component and sectoral growth will tend to revert to comparable magnitudes between rich and poor countries. As shown by the right-hand side of (12.36), that magnitude will mostly depend on the importance of component X(c) among the poor (the term CDc(z; α)/P(z; α)) as a proportion of the importance of component X(c) in total income (the term µX(c)/µX).
Note, therefore, that the use of poverty elasticities as opposed to poverty changes will often give a different picture of where growth is (or has been) most effective in reducing poverty. Using absolute poverty changes (12.35) will usually suggest that growth reduces poverty most in poorer countries; using elasticities (12.36) may instead imply that growth reduces poverty most in richer countries.
How fast can inequality-neutral growth in the economy be expected to reduce poverty? On which group can inequality-neutral growth be expected to reduce aggregate poverty the fastest? And in which group would poverty fall the fastest due to such growth?
Using (12.8) above, it can be shown that the elasticity of total FGT poverty with respect to total income (when growth in total income comes exclusively from inequality-neutral growth in group k) equals εy(k; z; α)13:
13 DAD: Poverty|FGT Elasticity.
for α≠0. When α = 0, (12.37) becomes:
Equations (12.37) and (12.38) can be used and interpreted in a number of interesting ways.
1 Replacing P(k; z; α) by P(z; α), P(k; z; α – 1) by P(z; α – 1), f(k; z) by f(z), and µ(k) by µ, in (12.37) and (12.38) gives as a special case the elasticity of total poverty with respect to inequality-neutral growth in the overall economy, εy(z; α).
2 Replacing P(z; α) by P(k; z; α), F(z) by F(k; z) and µ by µ(k) in (12.37) and (12.38) yields the elasticity of poverty in group k with respect to inequality-neutral growth in the income of that same group.
3 As discussed above, the most beneficial source of growth (for overall poverty reduction) may not come from those groups with greatest poverty. Groups in which poverty is highest will tend to have a large [P(k; z; α)–zP(k; z; α–1)], but we also need to consider the ratio of µ(k) to µ: high poverty in a group can also be associated with a high level of average income.
4 The growth elasticity of the headcount, εy(z; α = 0), has a nice graphical interpretation. To see this, consider Figure 12.1 where the density f(y) of income at different y is shown. Recall that the area underneath the f(y) curve up to y = z gives the headcount F(z). The term zf(z) in (12.38) is the area in Figure 12.1 of the rectangle with width z and height f(z). Hence, the elasticity (in absolute value) of the headcount with respect to inequality-neutral growth is given in Figure 12.1 by the ratio of the rectangular area zf(z) over the shaded area F(z).
It is clear, then, that this elasticity is larger than one whenever the poverty line z is lower than the (first) mode of the distribution. In fact, it will be above one in Figure 12.1 for any poverty line up to approximately z'. For poverty lines larger than z', the growth elasticity will in absolute value fall below 1.
This can have important policy consequences. For societies in which the poverty line is deemed to be lower than the mode (which is usually not far from the median), the headcount will fall at a proportional rate that is faster than the growth rate in average incomes. But for societies in which the headcount is initially high (larger than 0.5, say), we can expect the growth elasticity of the headcount to be lower than 1. This implies that inequality-neutral growth can be expected to have a proportionately smaller impact on the number of the poor in poorer societies than in richer ones.
5 The growth elasticity of the average poverty gap, εy(z; α = 1), also has a nice interpretation. Denote the average income of the poor by µp(z), that is, the mean of the incomes that lie below z. εy(z; α = 1) can then be expressed as:
Hence, the growth elasticity of the average poverty gap is simply (minus) the ratio of the poor's average income to the poor's average distance to the poverty line. Because this only takes into account the average income of the poor, however numerous or few they may be, the elasticity εy(z; α = 1) can easily be misleading. A society A with a small headcount and with a given µp(z) and a society B with a much larger headcount but the same µp(z) will exhibit the same growth elasticity, although intuitively we might feel that growth would decrease poverty more in B than in A.
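The invariance of this elasticity to the number of the poor is easy to reproduce. The sketch below builds two invented societies with the same average income among the poor but very different headcounts, computes minus the ratio of the poor's mean income to their mean shortfall from z, and checks it against a finite-difference elasticity of the normalized average poverty gap under inequality-neutral growth. Function names and data are hypothetical.

```python
def normalized_gap(incomes, z):
    """Normalized FGT index with alpha = 1 (average poverty gap over z)."""
    return sum(max(0.0, 1 - y / z) for y in incomes) / len(incomes)


def mean_income_of_poor(incomes, z):
    """Average income among those strictly below the poverty line."""
    poor = [y for y in incomes if y < z]
    return sum(poor) / len(poor)


def gap_growth_elasticity(incomes, z):
    """Minus the poor's mean income over their mean distance to z."""
    mu_p = mean_income_of_poor(incomes, z)
    return -mu_p / (z - mu_p)


def gap_growth_elasticity_numeric(incomes, z, g=1e-6):
    """Proportional change in the average poverty gap per unit of
    inequality-neutral growth g applied to every income."""
    base = normalized_gap(incomes, z)
    grown = normalized_gap([y * (1 + g) for y in incomes], z)
    return (grown - base) / base / g


z = 100
society_a = [40, 60] + [200] * 8       # 20% poor, hypothetical
society_b = [40, 60] * 4 + [200] * 2   # 80% poor, same mean income of the poor

for name, data in (("A", society_a), ("B", society_b)):
    print(name,
          round(gap_growth_elasticity(data, z), 4),
          round(gap_growth_elasticity_numeric(data, z), 4))
```

Both societies return the same elasticity even though four times as many people are poor in B, which is exactly why this statistic should be read alongside the headcount rather than on its own.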
It may also be of interest to predict how changes in inequality will affect poverty. The immediate difficulty here is that, unlike the case of growth in mean income, it is not immediately obvious which pattern of changing inequality we should consider. Indeed, as discussed above, a natural reference case for analyzing the impact of growth is the case of inequality-neutral growth: all incomes then vary proportionately by the same growth rate in mean income. For inequality changes, which inequality index should we use to measure inequality? And, supposing that we were to agree on the choice of such a summary inequality index, which of the many different ways in which that index can change by a given amount should we choose? Each of these different ways can have a dramatically different impact on poverty.
To make this difficulty slightly more concrete, suppose that we wish to understand the impact of an increase in the Gini index on the poverty headcount (this is often done in aggregate "inequality-poverty-growth regressions"). Also suppose that this increase in the Gini comes from a mean-neutral increase by some constant in the gap between two quantiles Q(p1) and Q(p2), with p2 – p1 = η > 0. From (4.12), note that the impact of this on the Gini is the same, whatever the value of p1. There are, however, several possible reactions of the headcount following this increase in the Gini:
1 If p1 is well above F(z), the headcount will not change;
2 If p1 is just above F(z), the headcount will increase;
3 If p1 is below F(z) and p2 is above F(z), the headcount will not change;
4 If p2 is just below F(z), the headcount will fall;
5 If p2 is well below F(z), the headcount will not change.
Clearly, even in this very special setting, the relationship between poverty and inequality is far from being unambiguous.
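The five cases can be reproduced with a small discrete example. In the sketch below, ten invented incomes are equally spaced, the poverty line is set at 55, and a fixed amount is taken from one person and given to the person two ranks above (so the rank gap, the analogue of η, is constant). The Gini index is computed with the usual rank-weighted formula for a sorted sample; all names and numbers are illustrative.

```python
def gini(incomes):
    """Gini index of a sample using the rank-weighted sum of sorted incomes."""
    xs = sorted(incomes)
    n = len(xs)
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2 * weighted / (n * sum(xs)) - (n + 1) / n


def headcount(incomes, z):
    """Share of the sample with income strictly below z."""
    return sum(1 for y in incomes if y < z) / len(incomes)


def spread(incomes, low, high, delta):
    """Mean-neutral spread: take delta from position `low`, give it to `high`."""
    out = list(incomes)
    out[low] -= delta
    out[high] += delta
    return out


base = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]   # hypothetical incomes
z, delta = 55, 6

# (low, high) positions matching cases 1 to 5 in the list above.
cases = [(7, 9), (5, 7), (4, 6), (2, 4), (0, 2)]

gini_changes = []
headcounts = []
for low, high in cases:
    after = spread(base, low, high, delta)
    gini_changes.append(gini(after) - gini(base))
    headcounts.append(headcount(after, z))

print("base headcount:", headcount(base, z))
print("Gini changes  :", [round(c, 5) for c in gini_changes])
print("new headcounts:", headcounts)
```

The Gini rises by the same amount in every case, while the headcount stays put, rises or falls depending only on where the two affected quantiles sit relative to the poverty line.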
So trying to predict the effect on poverty of a process of changing inequality, through the use of a single inequality index, is really to ask too much of summary indices of inequality. There cannot exist any stable structural relationship between inequality indices and poverty, even assuming mean income to be constant. This in fact casts serious doubt on the structural soundness of the many studies that regress past changes in poverty indices upon past changes in inequality indices, and which then try to explain or predict the impact of changing inequality on poverty.
What can be done, however, is to illustrate how some peculiar and simplistic pattern of changing inequality can affect poverty. Such an illustration can be made using the single-parameter (λ) process of bipolarization shown by equation (4.15). How does poverty change when inequality changes due to this bipolarization? For this, we use the most popular indices of poverty and inequality, the FGT and the Gini indices (the result is exactly the same if we use the broader class of S-Gini indices). Assume that the change in inequality comes from a λ that moves marginally away from 1. The impact on the normalized FGT index is given by
Thus, the elasticity of the (normalized and unnormalized) FGT poverty indices with respect to the Gini index is obtained as εG(z; α)14,
14 DAD: Poverty|FGT Elasticity.
for α > 0. When the headcount is used, we have
and thus15
Note that even with this highly simplified process of changing inequality, the impact on poverty is ambiguous. It depends in part on the sign of (µ–z). When mean income is below the poverty line, an increase in the Gini index can—and, for the headcount index, will—imply a fall in poverty.
We may now turn to the impact of policy and growth on inequality. The approach we use enables us to consider the impact on inequality of several ways in which income changes may occur. One is growth that takes place within a particular socioeconomic group. Another is growth that affects the value of some income sources—such as agricultural income or informal urban labor income. Another is the impact of price changes, which affect real income and its distribution. One more is the impact of changes in some tax or benefit policies, such as changing the subsidy rates on some production or consumption activity, or increasing the amount of monetary transfers made to some socioeconomic groups.
For each such income-changing phenomenon, we may be interested in the absolute amount by which inequality will change, or in the absolute amount by which inequality will change for each percentage change in mean real income, or in the elasticity of inequality with respect to mean income.
Assume that we have as above that total income X is the sum of C components, X(c), to which we apply again a factor λc to yield X = Σc λc X(c). We then have that µX = Σc λc µX(c). If we are interested in total consumption, then we may think of the X(c) as different types of consumption expenditures. If we are thinking of tax and benefit policy, then some of the X(c) may be transfers or taxes. If we are alternatively concerned with the impact of sectoral growth on income inequality, then we may think of the X(c) as different sources of income, or of the income of different socioeconomic groups. By how much, then, is inequality affected by variations in λc?16
15 DAD: Poverty|FGT Elasticity.
We will consider two ways of measuring inequality, the Lorenz curve and the S-Gini inequality indices (of which the traditional Gini is again a special case). The derivative of the Lorenz curve of X with respect to λc is given by:
Equation (12.44) therefore gives the change in the Lorenz curve per unit of λc, that is, per 100% proportional change in the value of X(c). Say that we predict that income component X(c) will increase by approximately 10% over the next year17. We can then predict that the Lorenz curve Lx(p) will move by approximately 10% of (12.44) over that same period. How big an impact this will be on inequality will depend of course on the size of the proportional change, on the importance of the component (µX(c)), and on the concentration of the component relative to that of total incomes (the difference Cx(c)(p) – Lx(p)).
A similar result is obtained for the Gini indices18:
Thus, if for instance the removal of a subsidy or the advent of an external shock is foreseen to increase by 10% the price of a good X(c), the Gini index can be predicted to move by approximately –[10% · µx(c)/µx · (ICx(c)(ρ) – Ix(ρ))]. (The negative sign comes from the fact that an increase in the price of a consumption good leads to a fall in the real value of the expenditures made on that good.) The impact per dollar of change in per capita income is then given by
We may also wish to assess the impact on inequality of a change in λc per 100% of mean income change. This is given by
16 DAD: Inequality|Income-Component Proportional Growth.
17 DAD: Curves|Lorenz and DAD: Curves|Concentration.
18 DAD: Inequality|Gini/S-Gini index and DAD: Redistribution|Coefficient of Concentration.
for the Lorenz curve and by
for the Gini indices. These expressions are simple to compute and have a nice interpretation. Multiplying the above two expressions by the proportional impact that some change in X(c) is predicted to have on total per capita income gives the predicted absolute change in inequality. For instance, if we predict that growth in rural areas will lift mean income in a country by 5%, then the Lorenz curve of total income Lx(p) will shift by approximately 0.05 (Cx(c)(p)–Lx(p)), where X(c) is rural income. If rural income is more concentrated among the poor than total income, this will push the Lorenz curves up; otherwise, growth in rural income will increase inequality.
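A small numerical check of the Gini result may help fix ideas. The sketch below uses invented rural and urban income components for six households and assumes the standard (not S-Gini) rank-weighted formulas for the Gini index and for the concentration index of a component ranked by total income. Under those assumptions, the change in the Gini per unit of λc is (µX(c)/µX) times the difference between the component's concentration index and the Gini, which the code compares with a finite-difference estimate.

```python
def concentration_index(values, ranking_variable):
    """Rank-weighted concentration index of `values`, with units ranked by
    `ranking_variable`. Ranking a variable by itself gives its Gini index."""
    n = len(values)
    order = sorted(range(n), key=lambda i: ranking_variable[i])
    mean = sum(values) / n
    weighted = sum((rank + 1) * values[i] for rank, i in enumerate(order))
    return 2 * weighted / (n * n * mean) - (n + 1) / n


def predicted_gini_change(component, total):
    """(mu_c / mu_X) * (IC_c - Gini): change in the Gini per unit of lambda_c."""
    mu_c = sum(component) / len(component)
    mu_x = sum(total) / len(total)
    gini_x = concentration_index(total, total)
    ic_c = concentration_index(component, total)
    return mu_c / mu_x * (ic_c - gini_x)


def numeric_gini_change(component, other, h=1e-6):
    """Finite-difference change in the Gini when `component` is scaled by (1 + h)."""
    total = [c + o for c, o in zip(component, other)]
    grown = [c * (1 + h) + o for c, o in zip(component, other)]
    return (concentration_index(grown, grown)
            - concentration_index(total, total)) / h


# Hypothetical household data: rural income matters most for the poorest.
rural = [30, 40, 45, 50, 20, 10]
urban = [5, 20, 40, 70, 150, 300]
total = [r + u for r, u in zip(rural, urban)]

print("predicted:", round(predicted_gini_change(rural, total), 5))
print("numeric  :", round(numeric_gini_change(rural, urban), 5))
```

Because rural income is more concentrated among the poor than total income in this example, its concentration index lies below the Gini and growth in rural income lowers measured inequality.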
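Finally, we may prefer to know the elasticity of inequality with respect to µ_X, when growth comes entirely from X(c). It is given by
$$\frac{\partial L_X(p)/\partial \lambda_c}{\partial \mu_X/\partial \lambda_c}\cdot\frac{\mu_X}{L_X(p)} = \frac{C_{X_{(c)}}(p)}{L_X(p)} - 1$$
for the Lorenz curve and by
$$\frac{\partial I_X(\rho)/\partial \lambda_c}{\partial \mu_X/\partial \lambda_c}\cdot\frac{\mu_X}{I_X(\rho)} = \frac{IC_{X_{(c)}}(\rho)}{I_X(\rho)} - 1$$
for the Gini indices. Thus, a proportional increase in taxes that reduces total mean net income by 1 percent will change the Gini index by 1 − IC_X(c)(ρ)/I_X(ρ) percent, where IC_X(c)(ρ) is the concentration index of taxes X(c). This will decrease inequality if taxes are more concentrated than net income: IC_X(c)(ρ)/I_X(ρ) > 1. The elasticity of the Lorenz curve and of the Gini indices with respect to µ_X(c) when growth comes entirely from a proportional change in X(c) is finally given by
$$\frac{\mu_{X_{(c)}}}{\mu_X}\left(\frac{C_{X_{(c)}}(p)}{L_X(p)} - 1\right)$$
and
$$\frac{\mu_{X_{(c)}}}{\mu_X}\left(\frac{IC_{X_{(c)}}(\rho)}{I_X(\rho)} - 1\right).$$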
As in the case of poverty (recall Section 12.4), it is useful to assess the impact of a price reform (through consumption and production taxation) on inequality. Assume that we are interested in the effects of a revenue-neutral marginal tax reform that increases the tax on a good j for a benefit of a fall in the tax on a good l. Recall that γ is the MECF for j over that for l: the larger the value of γ, the lower the fall in tl that we can generate for a given revenue-neutral increase in tj. Denoting real income by yR, the impact of a marginal revenue-neutral increase in the price of good j is then
on the Lorenz curve and an impact
on the Gini indices¹⁹. When γ = 1, viz, when the marginal economic efficiency of taxing l and j is the same, expressions (12.54) and (12.56) reduce to a proportion of the difference between the concentration curves and the concentration indices for the two goods. For instance, the change in the S-Gini inequality indices is then given by:
E:18.8.48
It is then better for inequality reduction to tax more the good that is less concentrated among the poor, for the benefit of a reduction in the tax rate on the other good, which is less concentrated among the rich.
We may also wish to express the above changes in inequality per 100% change in the value of per capita real income. This is then given by
19 DAD: CurvesLorenz and DAD: CurvesConcentration.
for the Lorenz curve and
for the Gini indices.
The literature on the empirical effectiveness of targeting schemes has grown substantially in recent years. See for instance Bisogno and Chong (2001) (on the effectiveness of proxy means tests), Hungerford (1996) (on the effectiveness of the targeting of social expenditures in the US), Gueron (1990) (on the effectiveness of targeted employment programs in the US), Park, Wang, and Wu (2002) (on the effectiveness of targeting in China), Ravallion, van de Walle, and Gautam (1995) (on the effect of the targeting of social programs on persistent and transient poverty in Hungary), Ravallion (2002) (on the variability of targeting across economic cycles in Argentina), Schady (2002) (on the potential for geographic targeting in Peru), and Moffitt (1989) and Slesnick (1996) (on whether in-kind transfers are efficient for poverty reduction).
The recent literature has also queried whether "finer" geographical targeting schemes lead to more equitable and more effective poverty reduction. Evidence on this issue–which is linked to the broader context of the benefits and costs of decentralization–is discussed inter alia in Alderman (2002) (for Albania), Bigman and Srinivasan (2002) (for India), and Ravallion (1999).
The targeting literature has often resorted to an analysis of programs' "targeting errors" and how they vary with program reforms. These errors are variably called "leakage" and "undercoverage" errors, "E" and "F" mistakes, and "Type I" and "Type II" errors. Discussion and use of them can be found in van de Walle and Nead (1995), Cornia and Stewart (1995) and Grosh (1995). See also van de Walle (1998b) (on the virtues and costs of "narrow and broad" targeting), and Wodon (1997b) (for use of "ROC curves" to study the performance of targeting indicators).
Work on the impact of marginal price changes on well-being and welfare includes: Ahmad and Stern (1984), Ahmad and Stern (1991), Creedy (1999b), Creedy (2001), Newbery (1995) and Stern (1984), for the impact of marginal indirect tax reforms on some parametric social welfare functions, making use among other things of the "distributional characteristics" of goods; Mayeres and Proost (2001), for the impact of marginal indirect tax reforms in the presence of an externality (peak car transport); Besley and Kanbur (1988), for the impact of marginal changes in food subsidies on FGT poverty indices; Creedy and van de Ven (1997), Creedy (1998a) and Creedy (1998b), for the impact of price changes and inflation on well-being and social welfare; Liberati (2001), Mayshar and Yitzhaki (1995), Mayshar and Yitzhaki (1996), Yitzhaki and Thirsk (1990), Yitzhaki and Slemrod (1991) and Yitzhaki and Lewis (1996), for the impact of marginal indirect tax reforms on classes of social welfare indices using "marginal" dominance analysis; Lundin (2001), for marginal dominance analysis for a marginal tax reform affecting the importance of an externality (the presence of carbon dioxide); Makdissi and Wodon (2002), for the use of CD curves in the analysis of marginal poverty dominance; and Yitzhaki (1997), for the impact on inequality of marginal price changes.
The lump-sum targeting schemes analyzed in Chapter 12 assumed that there exist characteristics on which governments can condition benefit transfers. For instance, we modelled the impact on poverty of giving $1 to everyone who belonged to some socio-demographic group k. These transfers were not decreasing with levels of income since we implicitly assumed that income levels were not directly observable. The tools derived in Chapter 12 enabled us, however, to identify on which observable socio-economic characteristics we should condition transfers to reduce poverty fastest.
We will suppose now that the distribution of population characteristics (including the levels of original income) can be observed without costs (for expositional simplicity), but that there exist costs to granting state support. We will see that the optimal targeting rules that follow are different from those of the traditional study of optimal income taxation, where labor supply and income generation are endogenous but where redistributive imperfections are generally ruled out. Instead, assume that the behavior of agents is fixed (e.g., constrained by labor market conditions) under alternative income support rules, except for the feature that such agents may freely choose whether to participate in the income support programs. Given the plausible presence of redistributive costs whose size may vary with individuals, the state then wishes to minimize the value of a poverty index, taking into account either the opportunity cost of government expenditures or the constraint of an aggregate redistributive budget for poverty alleviation. As we will see, the existence of redistributive costs leads to policy criteria that weigh efficiency as well as redistributive objectives. It also has important implications for the consideration of the principles of vertical and horizontal equity.
Redistributive costs can first arise from the efforts made by governments to monitor true levels of income. They can be interpreted in a sense as the certainty-equivalent costs of the presence of imperfect information. The more difficult it is to ascertain accurately someone's true income, the greater the expense of removing the associated information imperfections. Redistributive costs can also be incurred by benefit recipients and they may then have to be deducted from the gross impact of state support in order to yield net poverty relief. For expositional simplicity, we assume here that all costs take the form of a participation burden and that they are borne directly by the participating poor.
Assume that the poverty alleviation objectives of the state are to minimize the poverty index, P(z):
$$P(z) = \frac{1}{n}\sum_{i=1}^{n} \pi(y_i + NB_i; z), \tag{13.1}$$
where yi is the initial income of individual i, z is the poverty line, and NBi is the net benefit to individual i of the availability of a non-negative gross benefit Bi*. As we shall define it more precisely below, NBi is no greater than Bi* since it is reduced by the administrative and participation costs involved in transferring Bi*.
The government allocates a total per capita budget to the minimization of poverty, such that
with Bi being the level of gross benefit actually expended to support individual i.
Let Bi* then represent the benefit offered to individual i, and denote by ci the non-negative cost to i of accepting Bi*. If Bi* is less than ci, then the benefit awarded Bi and the net benefit NBi will be zero. When Bi* ≥ ci, then Bi = Bi* and NBi = Bi* − ci. Define an indicator function I[x] that takes the value 1 when x is true and 0 when x is false. Then
$$B_i = B_i^* \, I[B_i^* \geq c_i] \quad \text{and} \quad NB_i = (B_i^* - c_i)\, I[B_i^* \geq c_i]. \tag{13.3}$$
Costs ci are only incurred when Bi* is taken up. Think for instance of ci as an administrative cost necessary to grant support to i. Bi* is then the level of gross expenditures which the state would consider spending on i, Bi is the level of gross expenditures actually spent on i, and NBi is the level of benefit net of administrative costs that eventually reaches the individual.
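The take-up rule can be written as a few lines of code. This is a minimal sketch, assuming (as the text does) that an offer is taken up whenever it at least covers the participation cost; the function name `realized_benefits` is purely illustrative.

```python
def realized_benefits(offer, cost):
    """Return (B_i, NB_i) for an offered benefit B_i* and a participation cost c_i."""
    takes_up = offer >= cost                   # indicator I[B_i* >= c_i]
    gross = offer if takes_up else 0.0         # B_i: what the state actually spends
    net = (offer - cost) if takes_up else 0.0  # NB_i: what reaches the individual
    return gross, net


# An offer below the cost is not taken up: no expenditure and no net benefit.
declined = realized_benefits(offer=0.25, cost=0.5)
# An offer above the cost is taken up, and the cost is deducted from it.
accepted = realized_benefits(offer=0.75, cost=0.25)
print(declined, accepted)
```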
The government thus wishes to choose the various Bi*, i = 1,..., n, to minimize (13.1) subject to (13.2). Note that NBi and Bi are not differentiable with respect to Bi* at the point at which i just accepts state support, viz, when Bi* = ci. This causes no analytical difficulty since, as we will see, the optimum solutions for the Bi* never have to lie at these corner points.
Define λ as the Lagrange multiplier associated to the budget constraint, and π(1) as the nonnegative derivative of π with respect to y. For now, assume that π is continuous, differentiable and convex — we will discuss later the important headcount case for which π does not fulfill these conditions. The government then wishes to ensure that the following condition is met at the optimum values of Bi and λ (given by Bi* and λ*) for each of the i in receipt of state support:
The optimum value of λ reflects the social opportunity cost of spending public resources. Note that a benefit offer Bi* below ci will not matter, for then Bi = NBi = 0, that is, it has neither a cost nor a benefit.
Whether i should derive any net benefit at the optimum solution depends on his original income yi and on the redistributive cost ci that he faces. Figure 13.1 illustrates this dependency. The straight line λBi* displays the opportunity cost in social welfare of granting i a benefit Bi*. The receipt of such a benefit will bring a net benefit NBi that will decrease πi (the contribution of i to the poverty index P) once the redistributive cost has been paid off. The shape of −πi above 0 depends on the convexity of the function π(yi + NBi; z) and on the original income yi. Individuals for whom it is possible to find a level of expenditure Bi* for which the fall in πi is at least λBi* will be granted state support. Whether the fall in πi eventually reaches λBi* (and whether, therefore, the poverty alleviation benefit of granting state support to i is worth its opportunity cost) will thus also hinge on ci, the size of the redistributive costs.
Four cases are shown on Figure 13.1. Individual 1, with expenses c1, will receive benefit B1*, with a net benefit reward of B1* − c1. Individual 2, who faces the same redistributive cost but has a higher original income, will barely be deemed eligible, just as is the case with individual 3 with a lower y but a much higher c. Once benefit recipients, however, individuals 2 and 3 will receive what may be a sizeable net and gross benefit, thus showing an important discontinuity in the function of optimal state support. From the above optimality condition (13.4), we may indeed note that, when Bi* is received, the corresponding net benefit equalizes the post-benefit income (net of redistributive costs) of all benefit recipients. In other words, we have at the optimum that, for any two benefit recipients i and j:
$$y_i + B_i^* - c_i = y_j + B_j^* - c_j. \tag{13.5}$$
Individual 4, who enjoys a relatively large y and also faces high costs, does not benefit from the optimal program. Hence, all those individuals with:
original income greater than y2 and costs greater than c2,
or original income greater than y3 and costs greater than c3
ought not to receive income support. The greater the redistributive cost ci, the less the chance of receiving a positive Bi*, but the greater the optimal Bi* is if support should be granted. Furthermore, the greater his original income yi, the less likely an individual is to take up a positive Bi* and the smaller is Bi* if it is received.
Consider now the case in which P(z) = F(z) is the headcount. π(yi + NBi; z) is then discontinuous at the point at which yi + NBi reaches z. This leads the state to distribute Bi in such a way as to raise to z as many of the individuals as possible. In order to do this, it will grant income support z − yi + ci first to that poor individual for which that amount is lowest, then to that poor individual with the second lowest z − yi + ci, and so on, until the budget has run out. This relatively straightforward optimal policy is similar to Proposition I of Besley and Coate (1992) in the absence of redistributive costs. We illustrate it on Figure 13.2 supposing that z = 1 and that the budget runs out at an individual with z − y + c = 0.5z, that is, when the government spends half of the poverty line on a recipient. It is clear from the Figure that individuals with original incomes closer to the poverty line are more likely to be optimal benefit recipients. Conditional on being an optimal recipient, however, the expense generated in being awarded a benefit decreases with income and increases with costs.
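The headcount rule described above is a simple greedy procedure, which can be sketched as follows. The data are hypothetical, `headcount_targeting` is an illustrative name, and `total_budget` is taken here to be the aggregate (not per capita) budget; the loop stops at the first poor individual the remaining budget cannot lift to z, which is one reading of "until the budget has run out".

```python
def headcount_targeting(people, z, total_budget):
    """Greedy headcount rule. `people` is a list of (income y, cost c) pairs.

    Returns a dict mapping each recipient's index to the gross benefit z - y + c.
    """
    # Expense needed to lift each poor individual exactly to the poverty line.
    expenses = [(z - y + c, i) for i, (y, c) in enumerate(people) if y < z]
    grants, remaining = {}, total_budget
    for expense, i in sorted(expenses):  # cheapest individuals first
        if expense > remaining:
            break                        # the budget has run out
        grants[i] = expense
        remaining -= expense
    return grants


# Hypothetical population with z = 1: (income, redistributive cost).
population = [(0.9, 0.05), (0.5, 0.0), (0.8, 0.4), (0.2, 0.1)]
grants = headcount_targeting(population, z=1.0, total_budget=0.7)
print(sorted(grants))
```

With these numbers only the first two individuals are supported: the poorest person (income 0.2) is the most expensive to lift to z and is left out.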
Consider now the average poverty gap as P(z). It is continuous but not continuously differentiable everywhere in yi + NBi. Choosing to minimize the average poverty gap leads the state to choose benefit recipients such as to maximize the returns in poverty gap reduction per unit of government expenditure. In other words, the state wishes to minimize the aggregate level of redistributive costs incurred for a given total budget spent on the poor. Or, said again differently, the state attempts to fill as much as possible of the total poverty gap, avoiding as much as possible spending on wasteful redistributive costs.
Because there are fixed costs to granting income support, once an optimal benefit recipient has been identified, the state wishes to spend on him as much as is necessary to raise his net income to z. Thus, the government's optimal strategy is to compute an "efficiency" ratio (z − yi) / (z − yi + ci) of full poverty gap reduction over benefit expenditure for each individual i, and grant benefit first to that individual i with the greatest efficiency ratio, then to that individual j with the second highest ratio, etc., until the budget is depleted. Because some income support to some relatively poor individuals may yet involve relatively high redistributive costs, the state may find it preferable to grant income support to some richer individuals among the poor.
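The efficiency-ratio ranking can be illustrated with a short sketch. The population below is invented, the function name is hypothetical, and the budget is again treated as an aggregate amount; the procedure stops at the first ranked individual whose full expense z − y + c no longer fits in the remaining budget.

```python
def poverty_gap_targeting(people, z, total_budget):
    """Rank the poor by (z - y) / (z - y + c) and fund them in that order.

    `people` is a list of (income y, cost c) pairs; returns {index: gross benefit}.
    """
    ranked = sorted(
        (((z - y) / (z - y + c), i) for i, (y, c) in enumerate(people) if y < z),
        reverse=True,  # highest efficiency ratio first
    )
    grants, remaining = {}, total_budget
    for _, i in ranked:
        y, c = people[i]
        expense = z - y + c  # raises net income exactly to z
        if expense > remaining:
            break            # budget depleted
        grants[i] = expense
        remaining -= expense
    return grants


# Hypothetical data: the poorest person (index 0) also has the highest cost.
population = [(0.2, 0.8), (0.6, 0.0), (0.5, 0.25)]
grants = poverty_gap_targeting(population, z=1.0, total_budget=1.2)
print(sorted(grants))
```

Here the two richer poor individuals are supported while the poorest one is not, which is the outcome the paragraph warns about when redistributive costs differ across individuals.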
An individual i should then benefit from state support if the fall in his poverty gap does not fall below the opportunity cost of that fall. For all such benefit recipients, the state also wishes to raise their net income to the poverty line, z, with gross benefits and expenditures equal to Bi = z − yi + ci. Hence, at the optimum, an individual i will receive state support if
$$z - y_i \geq \lambda^* (z - y_i + c_i), \tag{13.6}$$
where λ* is the opportunity cost of government resources at the optimum. A decision maker may feel, for instance, that the benefit of a $1 reduction in the poverty gap is at the margin worth $2 in taxes, with a consequent value of λ* = 0.5. Thus, for individuals to be optimal benefit recipients, the social benefit of poverty gap reduction, net of the redistributive costs, must exceed the opportunity cost of gross state support. Otherwise, government expenditures would be better spent elsewhere than on poverty relief.
The identification of an optimal set of benefit recipients can thus be made on the basis of an opportunity cost, λ*, and on the interaction of z, yi and ci. From (13.6), we see that all i with
$$c_i \leq \frac{1 - \lambda^*}{\lambda^*}\,(z - y_i) \tag{13.7}$$
will be optimal recipients of state support Bi = z − yi + ci. A value of λ* equal to 1 would eliminate all i with ci greater than zero. The lower the value of λ*, the lower the opportunity cost of government expenditures, and the easier it is for poor individuals to qualify for state support. Condition (13.7) above thus explicitly defines a set of income support recipients with a border fixed by a linear tradeoff between ci and yi. To locate precisely that border, the opportunity cost of government expenditures (λ*) must be found or set. This can be done in at least three ways:
through setting λ* directly, taking into consideration the social welfare value of reducing the aggregate tax burden;
through setting a budget level that reflects the government's political or economic "capacity" to pay, and then deriving the implied value of λ*;
through identifying a point (yi, ci) that lies precisely on the border of the "eligibility set", and then calculating the implied λ*.
Let us illustrate the third way, which is both easy to follow and easy to interpret. At the borderline of eligibility, we note that:
$$\lambda^* = \frac{z - y_i}{z - y_i + c_i}. \tag{13.8}$$
Suppose, for instance, that we judge an individual with ci/z = 0.25 and yi/z = 0.5 to be just barely eligible for state support. It follows from (13.8) that λ* = 2/3. This says that a $2 decrease in the average poverty gap is deemed, at the margin, socially worth a $3 increase in per capita taxes. With this information, the entire set of optimal benefit recipients can be identified. The derived value of λ* = 2/3 says, for instance, that all those with no original income at all would yet receive no state support if their redistributive costs exceeded 50% of z. All those deemed eligible will receive Bi = z − yi + ci, and will see their net income raised to z. This is illustrated on Figure 13.3 where z is again set to 1. The likelihood of being an optimal benefit recipient decreases with both incomes and redistributive costs; the expense made when awarding a benefit decreases with incomes but increases with redistributive costs.
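The numbers in this example can be checked with a small sketch. It assumes the eligibility rule z − y ≥ λ*(z − y + c) for the average poverty gap, which is the rule that reproduces both the λ* = 2/3 and the 50% figures quoted in the text; all function names are illustrative.

```python
def implied_lambda(y, c, z=1.0):
    """Opportunity cost implied by a borderline individual with income y and cost c."""
    return (z - y) / (z - y + c)


def is_eligible(y, c, lam, z=1.0):
    """Poverty-gap rule: the gap filled must be worth at least lam times the expense."""
    return y < z and (z - y) >= lam * (z - y + c)


def max_cost(y, lam, z=1.0):
    """Largest redistributive cost at which an individual with income y still qualifies."""
    return (z - y) * (1.0 - lam) / lam


lam = implied_lambda(y=0.5, c=0.25)       # borderline individual from the text
cost_ceiling_no_income = max_cost(0.0, lam)
print(round(lam, 4), round(cost_ceiling_no_income, 4))
print(is_eligible(0.0, 0.4, lam), is_eligible(0.0, 0.6, lam))
```

The border between recipients and non-recipients is linear in (y, c): `max_cost` falls proportionally as income rises towards z.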
Consider finally the case in which the optimal state support policy must be geared towards minimizing the average of the squared poverty gaps, namely, P(z; α = 2). As above, individuals found to be optimal beneficiaries of state support will be those whose fall in poverty exceeds the opportunity cost of the gross expenditures needed to decrease their poverty, viz, those for whom we can find a Bi such that
$$(z - y_i)^2 - (z - y_i - B_i + c_i)^2 \geq \lambda^* B_i. \tag{13.9}$$
For beneficiaries (recall (13.5)), we will have that yi + Bi* − ci = e, where e is that constant to which the net income of all benefit recipients should be raised. Developing (13.9), we find that recipients will meet the condition that:
$$(z - y_i)^2 - (z - e)^2 \geq \lambda^* (e - y_i + c_i). \tag{13.10}$$
Because the return to decreasing the squared poverty gap decreases as net income approaches z, redistributive policy will benefit i only if λ* ≤ 2 (z − yi), the initial marginal social welfare return to raising i's income. If this condition were not satisfied, i would not receive income support even if ci were nil.
Equation (13.10) implicitly defines the set of the optimal recipients based on their values of yi and ci. Those with low yi or low ci will be granted support. The value e to which the level of all recipients' income will be raised depends implicitly on the opportunity cost λ*. The optimality condition requires that the marginal welfare gain of increasing Bi (when Bi = Bi*) is precisely equal to the opportunity cost λ* of such additional expenditure. If the welfare gain were higher than its opportunity cost, it would be preferable to increase support to the relevant i (instead of granting assistance to a new, additional recipient) since redistributive costs ci would then already have been "sunk". Hence, it must be that
$$2(z - e) = \lambda^*. \tag{13.11}$$
Using (13.10) and (13.11), the border of the eligibility set can now be defined by
$$(z - y_i)^2 - (z - e)^2 = 2(z - e)(e - y_i + c_i). \tag{13.12}$$
To define the set of optimal recipients, we therefore need
either to set directly the opportunity cost of state expenditures, λ*;
to agree on a poverty alleviation budget;
to identify one of the border points of the eligibility set;
or to rule on the value e to which the net income of all benefit recipients should be raised.
In every one of these cases, a value judgement is expressed on the social value of using costly redistributive tools. This value judgement determines the set of the recipients as well as the level of their post-transfer income.
Take the same "borderline" individual as above, with ci / z = 0.25 and yi/z = 0.5. For such a border point, we find e/z = 0.809 and λ*/z = 0.382. Using (13.12), it follows that when yi = 0, for instance, redistributive costs can go up to ci / z = 1.71 as a proportion of the poverty line before income support is withdrawn. For all benefit recipients, net income will be raised to a proportion e / z = 0.809 of the poverty line, with state expenditure on i equal to Bi = 0.809z – yi + ci. Incomes will not be raised to the poverty line since, above yi + NBi = 0.809z, the marginal welfare gain of additional state expenditure is lower than its opportunity cost.
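The figures quoted for the borderline individual can be reproduced with a short calculation. The sketch assumes that, at the border, the fall in the squared poverty gap equals λ* times the gross expenditure, with λ* = 2(z − e); writing u = z − e and g = z − y, this gives the quadratic u² − 2(g + c)u + g² = 0. The function names are illustrative only.

```python
import math


def squared_gap_border(y, c, z=1.0):
    """Return (e, lam) implied by a borderline recipient with income y and cost c."""
    g = z - y  # initial poverty gap
    # Border condition with lam = 2u and u = z - e:
    #   g^2 - u^2 = 2u(g - u + c)  =>  u^2 - 2(g + c)u + g^2 = 0.
    # The smaller root is the one that keeps e at or above y.
    u = (g + c) - math.sqrt((g + c) ** 2 - g ** 2)
    return z - u, 2.0 * u


def max_cost(y, e, lam):
    """Largest cost at which an individual with income y < e still qualifies."""
    # Rearranging the same border condition for c gives (e - y)^2 / lam.
    return (e - y) ** 2 / lam


e, lam = squared_gap_border(y=0.5, c=0.25)
c_max_no_income = max_cost(0.0, e, lam)
print(round(e, 3), round(lam, 3), round(c_max_no_income, 2))  # 0.809 0.382 1.71
```

Unlike the poverty gap case, the cost ceiling is quadratic in e − y, which is why the eligibility set is not convex.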
Figure 13.4 summarizes graphically these policy implications for the P(z; α = 2) index by showing the set of the optimal recipients as a function of their original income y and of the redistributive costs c that supporting them generates. The vertical axis shows the level B of expenditures which it is optimal to grant to individuals according to their value of y and c. For ease of reading, all variables are normalized by the poverty line z, which is equivalent to setting z = 1. The set of optimal recipients is clearly non-convex, although as we will discuss below, the optimal level of state expenditure shows local linearities with respect to incomes and redistributive costs.
There are several important lessons to be gained from the above discussion, and in particular from Figures 13.2, 13.3 and 13.4. First, state support for the eligible poor compensates them fully for their lower original income and/or higher redistributive costs. In other words, once they become recipients, they should receive support large enough to raise their net income to the level of that of all other optimal recipients.
Second, the case of c = 0 is clearly a special case in which there are fewer support discontinuities. In the more general framework in which redistributive costs c > 0 are allowed, however, some largely intuitive results do not hold any more. It is not true, for instance, that the state is indifferent as to the identity of the poor with the same yi: as seen above, values of c affect who should be targeted for poverty relief. Figure 13.4 also shows that all individuals with zero costs are optimal recipients of state support regardless of their own resources. With higher costs, however, optimal eligibility quickly becomes restricted to the very poor. As redistributive costs rise, the social gain of supporting those with relatively high incomes rapidly falls below the opportunity cost of state resources. Hence, as long as there prevails at least some redistributive cost, not all individuals should be raised to the same final net income, but an optimal selection needs to be made on the basis of original income and levels of redistributive costs.
This last result does not require variability in the redistributive costs across individuals. The more positive the correlation between levels of original income and redistributive costs, the greater the chance that poor individuals would be deemed optimal recipients of state support. But so long as redistributive costs are strictly positive, there will be some poor who will not be optimal benefit recipients. This can be seen on Figure 13.4 for those individuals with y/z at or slightly below 0.8, who become suddenly ineligible with small increases in their c/z. This discontinuity of the optimal level of state support as a function of original income also naturally occurs when using poverty indices that are discontinuous in income (such as the poverty headcount). Redistributive costs introduce these discontinuities for continuous poverty indices as well.
Third, the model above suggests some features of optimal redistribution policy that are somewhat disturbing, at least when considered in the context of the usual discussions of efficiency and equity. On account of the variability of redistributive costs across individuals, some relatively richer individuals might be deemed optimal recipients of income support whereas some poorer individuals might be denied such support. Supporting the poorer and not the richer may generate a greater level of vertical equity and of redistribution, but this is clearly not necessarily optimal if individuals differ in ways (other than their original income) that are relevant to the redistributive effectiveness of the state.
Finally, note in Figure 13.4 that all optimal recipients will receive enough support to raise their net income to 0.809z. There are, however, many individuals with original income less than 0.809z who will not qualify for state support and whose final income will consequently have to remain below 0.809z. Once optimal state support has been allocated, therefore, some of the originally poorer individuals will enjoy a level of net income above that of formerly richer individuals.
This reranking of individuals in the dimension of net incomes and welfare is especially likely when richer individuals present high levels of redistributive costs. It will also occur among those richer and poorer individuals that face identical redistributive expenses. Even more significantly, there are some originally richer individuals with a relatively low ci that will be denied support and end up worse off than some initially poorer individuals with higher ci. Classical horizontal inequity also occurs: individuals with the same original incomes are not all treated alike by the state. If deemed to be socially important, the consideration of horizontal inequity as a social evil would thus necessarily put a constraint on such policies.
The literature on optimal income taxation is large and varied. A review can be found in Slemrod (1990), Stern (1984) and Tuomala (1990) – see also Kanbur, Keen, and Tuomala (1994a). The literature typically allows for labor supply and income generation to be endogenous, but generally supposes the absence of redistributive imperfections – see Stern (1982) for an exception to this.
Budgetary rules under the more specific objective of poverty reduction are discussed in Bourguignon and Fields (1990), Bourguignon and Fields (1997), Kanbur (1985), and Chakravarty and Mukherjee (1998). Additional works on optimal income taxation and optimal benefit provision include Besley (1990) (for a comparison of means testing and universal provision of public assistance), Besley and Coate (1992) and Besley and Coate (1995) (on the desirability of workfare constraints), Creedy (1996) (for a comparison of means testing and linear taxation for poverty reduction), Fortin, Truchon, and Beauséjour (1990) (on comparing workfare and negative income tax systems), Glewwe (1992) (for designing benefit allocation rules when income is not observed), Haddad and Kanbur (1992) (for the potential role of intra-household allocation issues), Immonen, Kanbur, Keen, and Tuomala (1998) (for a comparison of means testing and categorical benefit provision), Kanbur, Keen, and Tuomala (1994b) (for differences in the optimal rules implied by welfarist and non-welfarist social objectives), Keen (1992) (for the link between needs and optimal allocations of benefits), Thorbecke and Berrian (1992) (for general-equilibrium optimal budgetary rules), Viard (2001) (for a theory of optimal categorical transfer payments), and Wane (2001) (for optimal taxation when poverty generates negative externalities on society).
DAD — which stands for "Distributive analysis/Analyse distributive" — is designed to facilitate the analysis and the comparison of social welfare, inequality, poverty and equity using micro (or disaggregated) data. It is freely distributed and its use does not require purchasing any commercial software. DAD's features include the estimation of a large number of indices and curves that are useful for distributive comparisons. It also provides various statistical tools to enable statistical inference. Many of DAD's features are useful for estimating the impact of programs (and reforms to these programs) on poverty and equity.
The first version of DAD was launched in September 1998. It initially came to life following a request by the Canadian International Development Research Centre (IDRC) to Université Laval to support research then carried out in Africa in the context of the IDRC's program on the Micro Impacts of Macroeconomic and Adjustment Policies (MIMAP). Improved versions of DAD subsequently appeared as errors and bugs were corrected and as attempts were made to make it more reliable, more flexible and broader in scope. The current version (January 2006) is 4.4.
Several factors motivated us in the process of building DAD. First, there seemed to be an ever increasing need for developing-country analysts to carry out poverty and inequality "profiles". Much of development policy is indeed now assessed through poverty criteria, and this is carried out among other things through the elaboration of poverty assessments, poverty reduction strategy papers (the now well-known PRSPs), poverty and social impact analyses, etc. Much of this distributive assessment had earlier typically been done by foreign consultants and by international organizations' technical staff. Little was left in the form of national capacity building and local empowerment following these largely external exercises. Local researchers and national policy analysts typically felt alienated by these poverty assessments that they often did not understand and that they could not usually influence. To break that segregation between foreign experts and local policy makers and analysts, it seemed useful to introduce tools that would benefit developing-country analysts pedagogically and operationally.
Second, microdata accessibility was increasingly becoming less of a problem to developing-country researchers. This followed what had occurred in more developed countries some 20 years earlier when data tapes and records started to circulate widely in research centers and universities. This was made possible in large part by the amazing increase in storage and processing speed that the computer revolution was creating. Developing-country analysts were gaining from the same advances, though with some lag due to tighter resource constraints. Furthermore, in addition to the computing and technical demands that handling large data sets involved, developing-country analysts often had to deal with data accessibility difficulties. This meant inter alia having to face skepticism and rent-seeking behavior from statistical agencies and international organization staff when requesting access to data that were supposed in principle to be public. That problem had also become less severe by the end of the 1990s, in part due to outside pressure. To process and analyze these data then typically became the next barrier to break.
Third, much of distributive analysis was (and is still) handled as if it was not subject to statistical uncertainty. Indeed, a considerable amount of energy and resources seems to be wasted in discussions of poverty and inequality "results" that cannot be trusted on formal statistical grounds. Even changes in poverty headcounts of around 4% or 5% are often statistically insignificant within the usual statistical precision criteria. Needless to say, the efforts deployed by analysts and policy makers to account for variations of less than 1% or 2% (as often occurs) in poverty rates are typically a pure loss of resources. This unfortunate state of affairs needed to be remedied by a much greater use of appropriate statistical techniques. Though conceptually relatively simple, the use of these techniques nevertheless required reading through some technical literature as well as writing tedious computer programs. DAD was in large part written to help bypass these hurdles. Achieving this meant clearing the ground of statistically insignificant results and leaving more time and resources for the interpretation of those distributive findings that were statistically significant.
DAD was thus conceived to help policy analysts and researchers analyze poverty and equity using disaggregated data. An overriding operational objective was to try to make DAD's environment as accessible and as user-friendly as possible. Carl Fortin, our co-author, convincingly argued from the start that we should program DAD in the Java programming language. An object-oriented language, Java created a new paradigm of platform independence: once written, Java applications could run on any operating system as well as on the internet. Conceived by Sun in 1995, Java could still be considered in 1998 to be an infant programming language. By now, however, it has become an important pillar of the programming and internet industry. To make DAD completely free of charge, we also chose not to tie its use to commercial statistical software such as Excel, SPSS, SAS or STATA. We therefore opted to design DAD from scratch using some of Java's packages as building blocks.
To make DAD as user-friendly as possible, we use pop-up application windows and spreadsheets as the main working tools. This enables users to visualize a lot of information at a glance, and to manage that information easily. Most of the relevant variables and options needed for running applications can be selected from single application windows. DAD's use of spreadsheets has the advantage of displaying the entire data sets to be used. Small data sets can easily be entered manually. Changes to cell values can be made directly on the spreadsheet. The results of operations on data vectors can be checked easily. DAD also allows loading two databases simultaneously, and makes it possible to display each of these two databases alternately on the spreadsheet. This makes it easy to carry out applications with either one or two databases. That structure also enables DAD to account for whether the databases are independent when it comes to computing standard errors on distributive estimators that use information from two samples.
DAD's databases are displayed on spreadsheets similar to those of SPSS, STATA, or Microsoft's Excel — see Figure 14.1. Every line in a sheet represents one observation or one data "record". Typically, an observation consists of one of the sampling or statistical units that were drawn into a survey. In distributive analyses, a sampling unit is often a household since it is households that are typically the last sampled units in surveys. When observations represent households, there will thus be as many lines or observations in the data as there are households drawn into the household survey. The statistical units (or units of interest) are usually (for ethical reasons) the individuals. Even though the sampling units originally drawn into the survey may have been the households, data sets are sometimes reorganized in such a way that each individual in a household is assigned its own line of data. There will then be as many observations in a data set as there are individuals found in the households.
A database used in DAD is then a matrix (a set of columns) whose length is the number of observations discussed above and whose width is the number of variables contained in the database. Each column displays the values of a variable. A variable has as many values as there are observations in the database. All columns in DAD are therefore of the same length. Variable values can have a "float" format (indicating, for example, the level of household income) or an "integer" format (showing, for instance, the socioeconomic category to which a household belongs).
There are several options for entering data into DAD. The first one is to create a new database in DAD and then enter the variable values manually. This can be useful for exploratory or pedagogical purposes. Clearly, however, this option is not convenient for entering large databases into DAD. A second option is to read existing databases into DAD using the well-known copy/paste facilities. Before doing this, however, a new database must be created in DAD and then assigned a number of observations (or size) that corresponds to the length of the variables that will be copied/pasted.
The third possibility for entering data into DAD is typically more reliable (and also faster) than the first two and involves two steps. The first step saves the database in an ASCII (or text) format. The way in which this is done in practice depends on the software in which the data were previously handled. DAD's Users Manual gives examples of such output procedures for several common commercial software packages. One fast alternative to this is offered by the use of STAT/TRANSFER (note however that this requires buying a license), which transforms databases rapidly from the most popular formats into an ASCII format. Once the database is in ASCII format, the second step imports it using DAD's Data Import Wizard. The wizard ensures inter alia that the imported database does not contain missing or unreadable values. Once the data are read into DAD, they can be submitted to a number of arithmetical and logical operations, variable names can be added or changed, and new variables can be created. Databases can subsequently be saved in DAD's preferred ASCII format (identified by the extension .daf).
As already mentioned, many of DAD's applications can use two databases simultaneously. To use a second database, the user should first activate a second file by clicking on the button File2, and then follow the same procedures as for loading a first file.
The process of generating random surveys usually displays five important characteristics (this is discussed in more detail in Chapter 16):
the base of sampling units (the base from which sample observations are drawn) is stratified;
sampling is multi-staged, generating clusters of observations;
observations come with sampling weights, also called inverse probability weights;
observations may have been drawn with or without replacement;
observations often provide aggregate information on a number of units of interest (such as the different individuals that live in a household).
Recent versions of DAD enable taking that structure into account for the estimation of the various distributive statistics as well as for the computation of the sampling distributions of these statistics.
When a data file is first read or typed into DAD, the survey design assigned to it by default is Simple Random Sampling. This supposes that the observations were independently selected from a large base of sampling units. This, however, is rarely how surveys are designed and implemented. Once the data are loaded, the exact sampling design structure can however be easily specified. This is done using the Set Sample Design dialogue box. Specifying the sample design structure can involve letting DAD know about (up to) 5 vectors (see Figure 14.2).
STRATA: this specifies the name of the variable (in an integer format) that contains the Stratum identifiers.
PSU: this specifies the name of the variable (in an integer format) that contains the identifiers for the Primary Sampling Units.
LSU: this specifies the name of the variable (in an integer format) that contains the identifiers for the Last Sampling Units.
SAMPLING WEIGHT: this specifies the name of the sampling weights variable.
CORRECTION FACTOR: this provides DAD with a Finite Population Correction variable.
Once the data have been read into DAD and the sampling design has been specified, the field is wide open for the estimation of distributive statistics and for performing statistical tests. For every application programmed in DAD, there is a specific application window that facilitates the specification of variables, parameters and options to generate the desired distributive statistics. For example, Figure 14.3 shows the specific application window for computing the FGT poverty index with one distribution. There is a separate specific window for the case of two distributions. The list of all applications available in DAD's current version 4.4 appears in Tables 14.1 and 14.2.
Most application windows, including that of Figure 14.3, are divided into three panels. The first panel is used to specify the relevant database variables needed for the estimation. The second panel (generally at the bottom of the application window) specifies the parameter values and options to be used by the estimator; examples include the level of inequality aversion, the value of the poverty line, the percentile to be considered, as well as whether indices should be normalized and whether statistical inference should be performed. The third panel activates buttons to generate various types of results. Some application windows can also generate pop-up dialogue boxes. One example of this can be found when clicking on the Compute line button in the Poverty FGT application window. This serves to specify the manner in which the poverty line should be (or was) estimated.
The following basic variables are typically required for carrying out DAD's computations.
VARIABLE OF INTEREST. This is the variable that usually captures living standards. It can represent, for instance, per capita total household income, expenditures per adult equivalent, calorie intake, height-for-age scores for children, etc.
SIZE VARIABLE. This refers to the "ethical" or physical size of the observation. For the computation of many distributive statistics, we will indeed wish to take into account how many relevant individuals (or statistical units) are found in a given observation. We might, for instance, wish to estimate inequality across individuals, the proportion of children who are poor, or the concentration of pension benefits among pensioners. Individuals, children and pensioners will then respectively be the statistical units of interest. Households do differ, however, in their size or in the number of children they contain. DAD takes this into account through the use of the SIZE VARIABLE. When an observation represents a household, computing inequality across individuals requires specifying household size as the SIZE VARIABLE, whereas computing poverty among children requires putting the number of children in the household as the SIZE VARIABLE. If the statistic of interest were the proportion of households in poverty, then no SIZE VARIABLE would be needed.
GROUP VARIABLE. (This should be used in combination with GROUP NUMBER.) It is often useful to limit some distributive analysis to some population subgroups. We might for example wish to estimate poverty within a country's rural area or within the group of public workers. One way to do this is to set SIZE VARIABLE to zero for all of the observations that fall outside these groups of interest. Another way is by defining a GROUP VARIABLE whose values will allow DAD to identify which are the observations of interest.
GROUP NUMBER. GROUP NUMBER tells DAD on which value of the GROUP VARIABLE to condition the computation of some distributive statistics. The value for GROUP NUMBER should be an integer. For example, rural households might be assigned a value of 1 for some variable denoted as region. Setting GROUP VARIABLE to region and GROUP NUMBER to 1 then tells DAD that we wish the distributive statistics to be computed only within the group of rural households.
SAMPLING WEIGHTS. Sampling weights are the inverse of the sampling rate. They are best specified once and for all using the Set Sample Design window (as discussed above). Distributive statistics (but not necessarily their sampling distribution and standard errors) will be left unchanged, however, if no variable is given for Sampling Weight (in Set Sample Design window) and if the product of the sampling weight and size variables is subsequently specified as the SIZE VARIABLE in the relevant application windows.
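The roles of these variables can be illustrated with a minimal Python sketch of a weighted FGT poverty index. This is not DAD's code: the function name `fgt_index`, its arguments and the toy data are hypothetical, and the sketch only reproduces the point estimate, not its sampling distribution.

```python
def fgt_index(incomes, z, alpha=0, sizes=None, weights=None,
              groups=None, group_number=None):
    """Normalized FGT index: weighted mean of ((z - y)/z)**alpha over the poor."""
    n = len(incomes)
    sizes = [1.0] * n if sizes is None else sizes        # SIZE VARIABLE
    weights = [1.0] * n if weights is None else weights  # SAMPLING WEIGHTS
    numerator = denominator = 0.0
    for i, y in enumerate(incomes):
        if groups is not None and groups[i] != group_number:
            continue  # skip observations outside the selected group
        w = weights[i] * sizes[i]  # sampling weight times size variable
        if y < z:  # only the poor contribute to the numerator
            numerator += w * ((z - y) / z) ** alpha
        denominator += w
    return numerator / denominator


# Hypothetical data: four households, per capita income as variable of interest.
income = [50.0, 150.0, 80.0, 200.0]
hh_size = [5, 2, 4, 1]
region = [1, 1, 2, 2]

across_individuals = fgt_index(income, 100.0, alpha=0, sizes=hh_size)
across_households = fgt_index(income, 100.0, alpha=0)
within_region_1 = fgt_index(income, 100.0, alpha=0, sizes=hh_size,
                            groups=region, group_number=1)
```

With these toy data, 9 of the 12 individuals but only 2 of the 4 households are poor, which shows why the choice of size variable matters. Passing the product of sampling weight and size as `sizes` gives the same point estimate as passing the two separately, in line with the remark on sampling weights above.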
DAD's applications with two distributions can be launched after having loaded two databases. Each time one launches an application that can support two distributions, the dialog box, shown in Figure 14.4, opens to allow the user to specify the desired number of distributions to be used as well as the name of the databases for these distributions. The application window for two distributions is very similar to that for one. The main difference is the addition of a second panel to specify the relevant variables to be used for the second distribution. The application for two distributions generally serves to compute distributive differences across the two distributions. For curve applications with two distributions, for instance, differences between the curves of the two distributions can usually be drawn.
DAD has builtin tools that facilitate the use of curves to display distributive information. Say, for instance, that we wish to graph a Lorenz curve. We can compare it to the 45° line to observe by how much income shares differ from population shares. This is done by following these steps:
From the main menu, select the submenu: Curves → Lorenz. Indicate that the number of distributions equals one.
After choosing the application variables, click on the button Graph to draw the first Lorenz curve.
If you would like to draw another Lorenz curve for another variable of interest, return to the Lorenz application window, reinitialize the variable of interest and click again on the button Graph.
Main menu         Applications
Inequality        Atkinson Index
Polarisation      Wolfson Index
Poverty           FGT Index
Dominance         Poverty Dominance
Welfare           Atkinson Index

Main menu         Applications
Decomposition     FGT: Decomposition by groups
Redistribution    Tax or Transfer
Curves            Lorenz
Distribution      Density Function
When the graph window appears, click on the button Draw all to plot all of the curves.
If you wish to draw the 45° line, select (from the main menu of the graph window) Tools → Properties, and activate the option DRAW THE 45° LINE.
Figure 14.5 shows an example of Lorenz curves drawn by DAD.
We can also compare two Lorenz curves to test for inequality dominance of one distribution over the other. For this, we choose again the application Curves → Lorenz, but this time with two distributions.
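What such a curve contains can be sketched in a few lines of Python. This is an illustration only, not DAD's implementation: the function names are hypothetical, weights are assumed strictly positive, and no statistical inference is attempted.

```python
def lorenz_curve(incomes, weights=None):
    """Return the Lorenz curve as (population shares, cumulative income shares)."""
    weights = [1.0] * len(incomes) if weights is None else weights
    pairs = sorted(zip(incomes, weights))  # order observations by income
    total_w = sum(w for _, w in pairs)
    total_y = sum(y * w for y, w in pairs)
    p, shares = [0.0], [0.0]
    cum_w = cum_y = 0.0
    for y, w in pairs:
        cum_w += w
        cum_y += y * w
        p.append(cum_w / total_w)
        shares.append(cum_y / total_y)
    return p, shares


def lorenz_at(p_target, p, shares):
    """Linearly interpolate the Lorenz curve at population share p_target."""
    for k in range(1, len(p)):
        if p_target <= p[k]:
            t = (p_target - p[k - 1]) / (p[k] - p[k - 1])
            return shares[k - 1] + t * (shares[k] - shares[k - 1])
    return 1.0


p_a, l_a = lorenz_curve([40.0, 10.0, 30.0, 20.0])
p_b, l_b = lorenz_curve([25.0, 25.0, 25.0, 25.0])
# Gap between the two curves at the median; a gap of the same sign at every
# population share is what Lorenz dominance requires in the sample.
gap_at_median = lorenz_at(0.5, p_b, l_b) - lorenz_at(0.5, p_a, l_a)
```

In this example the poorest half holds 30% of income in the first distribution and 50% in the equal one. The distance to the 45° line at any population share p is simply p minus the interpolated income share.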
DAD can also usually draw curves that show how the levels of some distributive statistics vary with ethical parameters, such as inequality or poverty aversion parameters. Take for instance the Atkinson index of inequality. It may be informative to check how fast it varies as a function of ε, its parameter of inequality aversion. To do this, follow these steps:
From the main menu, select the submenu: Inequality → Atkinson. Indicate that the number of distributions equals one.
After setting the application variables, click on the button Range and specify the desired range for the parameter ε.
Click on the button Graph to draw the curve that shows the Atkinson index against the inequality aversion parameter ε.
When the graph window appears, click on the button Draw to plot the curve.
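The curve produced by these steps can be mimicked with a short Python sketch. It assumes the standard definition of the Atkinson index as one minus the ratio of the equally-distributed-equivalent income to mean income, and strictly positive incomes; the function name and the example data are hypothetical and this is not DAD's code.

```python
import math


def atkinson_index(incomes, epsilon, weights=None):
    """Atkinson inequality index for inequality aversion epsilon >= 0."""
    weights = [1.0] * len(incomes) if weights is None else weights
    total_w = sum(weights)
    mean = sum(w * y for y, w in zip(incomes, weights)) / total_w
    if epsilon == 1:
        # Limiting case: the equally-distributed-equivalent income is the
        # weighted geometric mean.
        ede = math.exp(sum(w * math.log(y) for y, w in zip(incomes, weights)) / total_w)
    else:
        power_mean = sum(w * y ** (1 - epsilon) for y, w in zip(incomes, weights)) / total_w
        ede = power_mean ** (1 / (1 - epsilon))
    return 1 - ede / mean


# Index values over a range of inequality aversion parameters.
incomes = [1.0, 4.0]
curve = [(eps, atkinson_index(incomes, eps)) for eps in (0, 0.5, 1, 2)]
```

For this two-person example the index rises with ε, from 0 at ε = 0 to 0.36 at ε = 2, which is the kind of profile the Range and Graph buttons are meant to display.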
Recent versions of DAD are quite flexible in terms of editing, saving and printing graphs. On most application windows, a button Graph is available to draw graphs instantly. The type of graphs drawn depends on the application and on the type of Graph buttons selected. There are for instance two Graph buttons in the Poverty → FGT Index application window. Clicking on the Graph button plots estimates of the FGT index for a range of alternative poverty lines. Clicking on the Graph2 button draws instead estimates of the equally-distributed poverty gap that is equivalent to the estimated FGT poverty index, for a range of poverty aversion parameters α.
Most of the options for editing DAD's graphs can be accessed from the Graph Properties dialogue box — see Figure 14.7. DAD's graphs can also be saved in a variety of formats. Table 14.3 lists some of them.
Curves are useful tools to check various types of distributive dominance. Table 14.4 sums up some of the links between some of the applications and curves found in DAD and the tests for various orders of social welfare, poverty and inequality dominance.
Extension    Description
*.bmp        Bitmap image file
*.png        Portable Network Graphics
*.tif        Tagged Image File Format
*.jpg        JPEG File Interchange Format
*.pdf        Portable Document Format
*.ps         PostScript

Order   Primal approach                               Dual approach

Social welfare
1       Distribution → Distribution function          Curves → Quantile
2       Dominance → Poverty Dominance, s = 2          Curves → Generalized Lorenz
s       Dominance → Poverty Dominance

Poverty
1       Dominance → Poverty Dominance, s = 1          Curves → Poverty Gap
2       Dominance → Poverty Dominance, s = 2          Curves → CPG
s       Dominance → Poverty Dominance

Inequality
1       Dominance → Inequality Dominance, s = 1       Curves → Normalized Quantile
2       Dominance → Inequality Dominance, s = 2       Curves → Lorenz
s       Dominance → Inequality Dominance

DAD facilitates statistical inference in a number of original ways:
DAD readily provides asymptotic standard errors on a large number of estimators of distributive statistics, including estimators of inequality and social welfare indices, normalized/unnormalized poverty indices, poverty indices with deterministic/estimated poverty lines, poverty indices with absolute/relative poverty lines, equally-distributed-equivalent incomes and poverty gaps, quantiles, density functions, nonparametric regressions, points on a large number of curves, crossing points of curves, critical poverty lines, differences in indices and curves, ratios of various statistics, various income/price/population impacts and elasticities, distributive decompositions into demographic/factor components, progressivity, redistribution and equity indices, dominance statistics, etc. It can be (and has typically formally been) shown that all of these estimators are asymptotically normally distributed.
DAD can calculate the sampling distribution of most of these estimators taking into account the sometimes complex design of the surveys. This is done as indicated in Section 14.3. Existing (commercial) software can sometimes take this design into account, but only for a small number of relatively simple distributive statistics (such as simple sums and ratios).
DAD can provide at the click of a button estimates of confidence intervals as well as test statistics and p-values for various symmetric and asymmetric hypothesis tests of interest.
DAD can be used to simulate numerically the finite-sample sampling distribution of most of the above-mentioned estimators using bootstrap procedures. The bootstrap can be performed on the ordinary estimators or on (asymptotically) pivotal transforms of them. It is well known that bootstrapping on pivotal statistics leads to faster rates of convergence to the true sampling distribution than bootstrapping on untransformed non-pivotal statistics. Pivotal bootstrapping is, however, usually more costly in time and resources since it requires estimates of the asymptotic distribution of the estimators. This is not a problem for DAD, however, since the (sometimes complex) asymptotic standard errors of these estimators are already programmed into it. Moreover, as mentioned above, the asymptotic standard errors and the pivotal statistics derived from them can be sample-design corrected, providing one more degree of superior accuracy for the bootstrap procedures available in DAD.
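The difference between asymptotic and pivotal bootstrap inference can be sketched for the simplest case, the poverty headcount under simple random sampling. The sketch below is an illustration under those assumptions only: the function names are hypothetical, the standard error is the usual binomial one, no sample-design correction is made, and the quantiles are picked crudely from the sorted bootstrap statistics.

```python
import math
import random


def headcount_and_se(incomes, z):
    """Poverty headcount and its asymptotic standard error (simple random sampling)."""
    n = len(incomes)
    p = sum(1 for y in incomes if y < z) / n
    return p, math.sqrt(p * (1 - p) / n)


def difference_z_statistic(p1, se1, p2, se2):
    """z statistic for the difference of two estimates from independent samples."""
    return (p1 - p2) / math.sqrt(se1 ** 2 + se2 ** 2)


def bootstrap_t_interval(incomes, z, reps=999, level=0.95, seed=0):
    """Confidence interval from bootstrapping the pivotal statistic (p_b - p_hat)/se_b."""
    rng = random.Random(seed)
    n = len(incomes)
    p_hat, se_hat = headcount_and_se(incomes, z)
    t_stats = []
    for _ in range(reps):
        resample = [incomes[rng.randrange(n)] for _ in range(n)]
        p_b, se_b = headcount_and_se(resample, z)
        if se_b > 0:  # the pivotal statistic is undefined when se_b is zero
            t_stats.append((p_b - p_hat) / se_b)
    t_stats.sort()
    t_lo = t_stats[int((1 - level) / 2 * len(t_stats))]
    t_hi = t_stats[min(len(t_stats) - 1, int((1 + level) / 2 * len(t_stats)))]
    return p_hat - t_hi * se_hat, p_hat - t_lo * se_hat


# Two hypothetical samples of 400 households with headcounts of 30% and 35%.
sample_1 = [50.0] * 120 + [150.0] * 280
sample_2 = [50.0] * 140 + [150.0] * 260
p1, se1 = headcount_and_se(sample_1, 100.0)
p2, se2 = headcount_and_se(sample_2, 100.0)
z_stat = difference_z_statistic(p1, se1, p2, se2)
```

In this example the five-percentage-point difference in headcounts gives a z statistic of about 1.5 in absolute value, short of the 1.96 needed for significance at the 5% level in a two-sided test.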
The Standard deviation, confidence interval and hypothesis testing dialogue box is the main tool for telling DAD what to do in terms of statistical inference. This box is shown in Figure 14.8.
For further information on Java's development and structure, see the introductory book by Deitel and Deitel (2003), or Chapter 1 of Lewis and Loftus (2000). DAD's official web page provides access to extensive information on the software:
It is often useful to visualize the shapes of income distributions. There are essentially two main approaches to doing so, and a mixture of the two. The first approach uses parametric models of income distributions. These models assume that the income distribution follows a known particular functional form, but with unknown parameters. Popular examples of such functional forms include the lognormal, the Pareto, and variants of the beta or gamma distributions. The main statistical challenge is then to estimate the unknown parameters of that functional form, and to test whether a given functional form appears to fit the observed distribution of income better than another functional form.
The second approach does not posit a particular functional form and does not require the estimation of functional parameters. Instead, it lets the data entirely "speak for themselves". It is therefore said to be nonparametric. The method is most easily understood by starting with a review of the density estimation used by traditional histograms. Histograms provide an estimate of the density of a variable $y$ by counting how many observations fall into "bins", and by dividing that number by the width of the bin times the number of observations in the sample. To see this more clearly, denote the origin of the bins by $y_0$ and the bins of the histogram by $[y_0 + mh,\; y_0 + (m+1)h]$ for positive or negative integers $m$. For instance, if we take $m = 0$, then the bin is described by the interval ranging from the origin to the origin plus $h$. Also, let $\{y_i\}_{i=1}^{n}$ be a sample of $n$ observations of income $y_i$. The value of the histogram over each of the bins is then defined by
$$\hat{f}(y) = (nh)^{-1}\,\#\{y_i : y_i \text{ falls in the same bin as } y\}. \tag{15.1}$$
Such a histogram is shown in Figure 15.1 by the rectangles of varying heights over identical widths, starting with origin $y_0$. For bins defined by $[y_0 + mh,\; y_0 + (m+1)h]$, the bin width is indeed a constant set to $h$, but we can also allow the widths to vary across the bins of the histogram. The choice of $h$ controls the amount of smoothing performed by the histogram. A small bin width $h$ will lead to significant fluctuations in the value of the histogram, and a very large width will set the histogram to the constant $h^{-1}$. Choosing an appropriate value for such a smoothing parameter is in fact a pervasive preoccupation in nonparametric estimation procedures, as we will discuss later. The choice of the origin can also be important, especially when $n$ is not very large. There can be, however, little guidance on that latter choice, except perhaps when the nature of the data suggests a natural value for $y_0$. One way to avoid choosing such a $y_0$ is by constructing what will soon appear to be a "naive" kernel density estimator, that is, one in which the point $y$ is always at the center of the bin:
$$\hat{f}(y) = (2nh)^{-1}\,\#\{y_i : y - h < y_i < y + h\}. \tag{15.2}$$
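This naive estimator can also be obtained from the use of a weight function $w(u)$, defined as:
$$w(u) = \begin{cases} 0.5 & \text{if } |u| < 1, \\ 0 & \text{otherwise,} \end{cases} \tag{15.3}$$
and by defining
$$\hat{f}(y) = (nh)^{-1}\sum_{i=1}^{n} w\left(\frac{y - y_i}{h}\right). \tag{15.4}$$
This frees the density estimation from the choice of $y_0$. This naive estimator can also be improved statistically by choosing weighting functions that are smoother than $w(u)$ in (15.3). For this, we can think of replacing the weight function $w(u)$ by a general "kernel function" $K(u)$, such that¹
$$\hat{f}(y) = (nh)^{-1}\sum_{i=1}^{n} K\left(\frac{y - y_i}{h}\right). \tag{15.5}$$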
A smooth kernel estimate of the density function that generated the histogram is shown on Figure 15.1.
¹ DAD: Distribution → Density Function.
In general, we would wish $\int K(u)\,du = 1$, since we would then have $\int \hat{f}(y)\,dy = 1$. For $\hat{f}(y)$ to qualify fully as a probability density function, we would also require $K(u) \geq 0$, since we would then be guaranteed that $\hat{f}(y) \geq 0$, although there are sometimes reasons to allow for negativity of the kernel function. $h$ is usually referred to as the window width, the bandwidth or the smoothing parameter of kernel estimation procedures. There are also arguments to adjust the window width that applies to observation $y_i$ for the number of observations that surround $y_i$, making $h$ larger for areas where there are fewer observations. This is done for instance by the nearest neighbor and the adaptive kernel methods. As in the use of the naive density estimator, each observation will provide a box or a "bump" to the density estimation of $f(y)$, and that bump will have a shape and a width determined by the shape of $K(u)$ and the size of $h$ respectively².
E: 18.5.1
The definition of $\hat{f}(y)$ in (15.5) makes it inherit the continuity and differentiability properties of $K(u)$. It is often sound and convenient to choose a kernel function that is symmetric around 0, with $\int uK(u)\,du = 0$ and $\int u^2K(u)\,du = \sigma_K^2 > 0$. One such kernel function that has nice continuity and differentiability properties is the Gaussian kernel, defined by
$$K(u) = (2\pi)^{-0.5}\exp\left(-0.5\,u^2\right). \tag{15.6}$$
The "bumps" provided by the Gaussian kernel have the familiar bell shapes, are smoothly differentiable up to any desired level, and are such that $\sigma_K^2 = 1$.
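The three estimators discussed so far (histogram, naive estimator and kernel estimator) can be compared in a small Python sketch. It is only an illustration for an unweighted sample, with hypothetical function names and toy data; it is not the code used by DAD.

```python
import math


def histogram_density(y, sample, h, y0=0.0):
    """Histogram estimate: share of observations in the bin containing y, over h."""
    m = math.floor((y - y0) / h)  # index of the bin [y0 + m*h, y0 + (m+1)*h)
    lower, upper = y0 + m * h, y0 + (m + 1) * h
    count = sum(1 for yi in sample if lower <= yi < upper)
    return count / (len(sample) * h)


def naive_density(y, sample, h):
    """Naive estimate: the bin of width 2h is always centered on y."""
    count = sum(1 for yi in sample if abs((y - yi) / h) < 1)
    return count / (2 * len(sample) * h)


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel function K(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def kernel_density(y, sample, h, kernel=gaussian_kernel):
    """Kernel estimate: average of kernel 'bumps' centered on each observation."""
    return sum(kernel((y - yi) / h) for yi in sample) / (len(sample) * h)


sample = [1.0, 2.0, 2.5, 4.0]
estimates = (histogram_density(2.2, sample, 1.0),
             naive_density(2.2, sample, 1.0),
             kernel_density(2.2, sample, 1.0))
```

Only the histogram depends on the origin `y0`; the naive and kernel estimates depend on the data and on `h` alone. Because the Gaussian kernel integrates to one, so does the kernel estimate.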
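The efficiency of nonparametric estimation procedures is usually measured by the mean square error (MSE) that there is in estimating the function $f(y)$ at a point $y$. The MSE in estimating $f(y)$ by $\hat{f}(y)$ is defined by
$$\text{MSE}_y(\hat{f}) = E\left[\left(\hat{f}(y) - f(y)\right)^2\right]. \tag{15.7}$$
The most common way of defining a measure of global accuracy simply sums the mean square error across values of $y$. This yields the mean integrated square error (or MISE), a measure of the accuracy of estimating $f(y)$ over the whole range of $y$:
$$\text{MISE}(\hat{f}) = \int \text{MSE}_y(\hat{f})\,dy = E\left[\int \left(\hat{f}(y) - f(y)\right)^2 dy\right]. \tag{15.8}$$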
² DAD: Distribution → Density Function.
The relative efficiency of a particular choice of a kernel function K(u) can then be assessed relative to that choice of the kernel function which would minimize the MISE. The Gaussian kernel function has very good efficiency properties, although they are not quite as good as those of some other (less smooth) kernel functions, such as the (efficiency-optimal) Epanechnikov, the biweight or the triangular kernels, which are described and discussed for instance in Silverman (1986) (see in particular Table 3.1).
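Even if we were to agree on a particular shape for an argument-centered kernel function, however, there would still remain the question of which window width to choose. Again, conditional on the choice of a particular form for $K(u)$, we can choose the window width that minimizes the MISE. To see what this implies, note first that we can decompose the MSE at $y$ as the sum of the square of the bias and of the variance that there is in estimating $f(y)$:
$$\text{MSE}_y(\hat{f}) = \left(E\left[\hat{f}(y)\right] - f(y)\right)^2 + \text{var}\left(\hat{f}(y)\right) = \text{bias}_y(\hat{f})^2 + \text{var}_y(\hat{f}). \tag{15.9}$$
For symmetric kernel functions, the bias can be shown to be approximately equal to
$$\text{bias}_y(\hat{f}) \approx 0.5\,h^2\sigma_K^2 f^{(2)}(y), \tag{15.10}$$
where, as before, $f^{(i)}(y)$ stands for the $i$th-order derivative of $f(y)$. The variance is approximately
$$\text{var}_y(\hat{f}) \approx n^{-1}h^{-1}c_K f(y), \tag{15.11}$$
where $c_K = \int K(u)^2\,du$. Substituting (15.10) and (15.11) in (15.9) then gives:
$$\text{MSE}_y(\hat{f}) \approx 0.25\,h^4\sigma_K^4\left(f^{(2)}(y)\right)^2 + n^{-1}h^{-1}c_K f(y). \tag{15.12}$$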
Hence, considering (15.10), we find that the bias of $\hat{f}(y)$ will be low if the kernel function has a low variance, since it is then the observations that are "closer" to $y$ that will count more, and since it is those observations that provide the least biased estimate of the density at $y$. But the bias also depends on the curvature of $f(y)$: in the absence of such a curvature, the density function is linear and the bias provided by using observations on the left of $y$ is just (locally) offset by the bias provided by using observations on the right of $y$. When $f^{(2)}(y) = 0$, therefore, there is asymptotically no bias in using kernel density estimation.
Looking at (15.11), we find ceteris paribus that a flatter kernel (i.e., with a lower $c_K$) decreases the variance of $\hat{f}(y)$. A flatter kernel weights more equally the observations found around $y$, and that reduces the variance of an estimator such as (15.5). We also obtain the familiar result that the variance of the estimator decreases proportionately with the size of the sample.
An increase in $h$ has offsetting effects on the precision of $\hat{f}(y)$, as is shown by (15.12). When $f^{(2)}(y) \neq 0$, a large $h$ increases the bias by making the estimators too smooth: too much use is made of those observations that are not so close to $y$. Conversely, a large $h$ reduces the variance of $\hat{f}(y)$ by making it less variable and less dependent on the particular value of those observations that are very close to $y$. Hence, in choosing $h$ in an attempt to minimize the MISE, a compromise needs to be struck between the competing virtues of bias and variance reductions. The precise nature of this compromise will depend on the shape of the kernel function as well as on the true population density function. For instance, if the Gaussian kernel is used and if the true density function is normal with variance $\sigma^2$, then the choice of $h$ that minimizes the MISE is given by (see for instance Silverman 1986, p.45):
$$h^* = 1.06\,\sigma\,n^{-1/5}. \tag{15.13}$$
This value of $h^*$ is conditional on both $K(u)$ and $f(y)$ being normal density functions. Silverman (1986) also argues for a more robust choice of $h^*$, given by
$$h^* = 0.9\,A\,n^{-1/5}, \tag{15.14}$$
where A = min(standard deviation, interquartile range/1.34). This is because (15.14)
(...) will yield a mean integrated square error within 10% of the optimum for all the t-distributions considered, for the lognormal with skewness up to about 1.8, and for the normal mixture with separation up to 3 standard deviations. (...) For many purposes it will certainly be an adequate choice of window width, and for others it will be a good starting point for subsequent fine tuning. (Silverman 1986, p.48)
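The two window-width rules can be compared on a sample with a short Python sketch. It is a hedged illustration: the function name is hypothetical, the sample standard deviation stands in for the population one, and the quartiles are computed with the default method of Python's `statistics.quantiles`, which is only one of several conventions.

```python
import statistics


def reference_bandwidths(sample):
    """Return (normal-reference, robust) window widths for a Gaussian kernel."""
    n = len(sample)
    sd = statistics.stdev(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4)  # quartiles of the sample
    a = min(sd, (q3 - q1) / 1.34)
    return 1.06 * sd * n ** (-0.2), 0.9 * a * n ** (-0.2)


well_behaved = [float(v) for v in range(1, 101)]
with_outlier = [float(v) for v in range(1, 100)] + [10000.0]
h_normal, h_robust = reference_bandwidths(well_behaved)
h_normal_out, h_robust_out = reference_bandwidths(with_outlier)
```

A single very large income inflates the standard deviation and hence the normal-reference width, while the robust rule falls back on the interquartile range and changes little.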
Further (asymptotic) results show that, under some mild assumptions (in particular, that the density function $f(y)$ is continuous at $y$, and that $h \to 0$ and $nh \to \infty$ as $n \to \infty$), the kernel estimator converges to $f(y)$ as $n \to \infty$. When $h$ is chosen optimally, it is of the order of $n^{-1/5}$, and by (15.12) the MISE is then of the order of $n^{-4/5}$, so that the estimation error (the square root of the MISE) is of the order of $n^{-0.4}$. This is slightly slower than the analogous usual rate of convergence of parametric estimators, which is $n^{-0.5}$.
Kernel estimation can also be used for multivariate density estimation. Let $u$, $y$ and $y_i$ be $d$-dimensional vectors. We can estimate a $d$-dimensional density function as³:
$$\hat{f}(y) = (nh^d)^{-1}\sum_{i=1}^{n} K\left(\frac{y - y_i}{h}\right), \tag{15.15}$$
where $h$ is a window width common to all of the dimensions. The multivariate Gaussian kernel is given by $K(u) = (2\pi)^{-d/2}\exp\left(-0.5\,u'u\right)$. The issues of kernel function and window width selections are similar to those discussed above for univariate density estimation. The approximately optimal window width converges at the rate $n^{-1/(d+4)}$, and the optimal window width for the Gaussian kernel and a multivariate normal density $f(y)$ with unit variance is given by $h^* = \left(4/(n(d+2))\right)^{1/(d+4)}$.
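A minimal Python sketch of the multivariate estimator with a Gaussian kernel and a common window width follows. Function names are hypothetical, observations are plain tuples of length d, and sampling weights are ignored; it is meant only to make the formula concrete.

```python
import math


def multivariate_gaussian_kernel(u):
    """d-dimensional standard normal density evaluated at the vector u."""
    d = len(u)
    return (2 * math.pi) ** (-d / 2) * math.exp(-0.5 * sum(x * x for x in u))


def multivariate_kernel_density(y, sample, h):
    """Kernel estimate of a d-dimensional density with a common window width h."""
    d, n = len(y), len(sample)
    total = sum(
        multivariate_gaussian_kernel([(a - b) / h for a, b in zip(y, yi)])
        for yi in sample
    )
    return total / (n * h ** d)


def normal_reference_width(n, d):
    """Window width for a Gaussian kernel and a unit-variance normal density."""
    return (4 / (n * (d + 2))) ** (1 / (d + 4))


pairs = [(0.0, 0.0), (1.0, 0.5), (2.0, 2.0)]  # e.g. (gross, net) incomes, rescaled
h = normal_reference_width(len(pairs), 2)
density_at_point = multivariate_kernel_density((1.0, 1.0), pairs, h)
```

For d = 1 the normal-reference width reduces to roughly 1.06 n^(-1/5), the univariate rule given earlier for a unit-variance normal density.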
Simulations from an estimated density are sometimes needed to compute estimates of functionals of the unknown true density function. This is the case, for instance, for the estimation in DAD of indices of classical horizontal inequity. The estimation of such indices requires information on the net income distribution of those who have the same gross incomes, and such information cannot be gathered directly from sample observations of net and gross incomes since very few (if any) exact equals can be observed in random samples of finite sizes. Another use of simulated distributions is for computing bootstrap estimates of the sampling distribution of some estimators. The usual bootstrap procedure proceeds by conducting successive random sampling (with replacement) from the original sample $\{y_i\}_{i=1}^{n}$. This constrains the new samples to contain only those observations $y_i$ that were contained in the original sample. Those new samples could instead be generated from a nonparametric estimate of the density of the original sample of incomes, which would yield a bootstrap estimate that would be smoother and less dependent on the precise values that the observations $y_i$ took in the original sample.
³ DAD: Distribution → Joint Density Function.
Consider first the case of generating $J$ independent realizations, $\{y_j^*\}_{j=1}^{J}$, in a univariate case, and suppose that a nonnegative kernel function $K(u)$ with window width $h$ is used to estimate $f(y)$. Also assume that observation $i$ has sampling weight $w_i$, and suppose for simplicity that the initial observations were drawn independently from each other. The following simple algorithm is adapted slightly from Silverman (1986), p.143. For $j = 1, \ldots, J$, we then:
Step 1 Choose $i$ with replacement from $\{1, \ldots, n\}$ with probability $w_i / \sum_{k=1}^{n} w_k$;
Step 2 Choose $\varepsilon$ randomly using the probability density function $K$;
Step 3 Set $y_j^* = y_i + h\varepsilon$.
Note that this algorithm does not even require computing $\hat{f}(y)$ directly.
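The three steps translate almost line by line into Python. The sketch below assumes a Gaussian kernel, so that Step 2 is a standard normal draw; the function name, the seed argument and the example data are hypothetical.

```python
import random
import statistics


def simulate_from_kernel_density(sample, weights, h, n_draws, seed=0):
    """Draw n_draws values from the kernel density estimate of a weighted sample."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Step 1: pick an observation with probability proportional to its weight.
        y_i = rng.choices(sample, weights=weights, k=1)[0]
        # Step 2: draw epsilon from the kernel (Gaussian kernel assumed here).
        epsilon = rng.gauss(0.0, 1.0)
        # Step 3: shift the chosen observation by h times epsilon.
        draws.append(y_i + h * epsilon)
    return draws


draws = simulate_from_kernel_density([0.0, 10.0], [1.0, 1.0], h=1.0, n_draws=20000)
simulated_variance = statistics.pvariance(draws)
```

With `h = 0` the procedure collapses to the ordinary weighted bootstrap, since every draw is then one of the original observations. With `h > 0` the simulated values are more dispersed than the original sample: here the variance of the two observations is 25, while the simulated variance is close to 26.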
For the multivariate case, the above algorithm becomes just slightly more complicated. For instance, for the estimation of classical HI at gross income $x$, we need to generate a random sample of net incomes, $\{y_j^*(x)\}_{j=1}^J$, that follows the estimated kernel conditional density $\hat f(y|x)$. For this, we use the original sample $\{(x_i, y_i)\}_{i=1}^n$ with sampling weights $w_i$. For $j = 1, \ldots, J$, we then:
Step 1 Choose $i$ with replacement from $\{1, \ldots, n\}$ with probability $w_i K\left((x - x_i)/h\right) / \sum_{l=1}^n w_l K\left((x - x_l)/h\right)$;
Step 2 Choose $\varepsilon$ randomly using the probability density function $K$;
Step 3 Set $y_j^*(x) = y_i + h\varepsilon$.
This gives a simulated sample of net incomes $\{y_j^*(x)\}_{j=1}^J$, conditional upon gross income being exactly equal to $x$. A local index of classical HI at $x$ can then be computed using this simulated sample, and global indices of classical HI can be estimated simply by repeating this procedure for each of the observed values of gross incomes, $x_1, \ldots, x_n$.
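A minimal sketch of this conditional simulation follows. It assumes a Gaussian kernel and, as in the steps above, a single window width $h$ for both variables; the function name `simulate_conditional` is hypothetical.

```python
import numpy as np

def simulate_conditional(x0, x, y, w, h, J, rng=None):
    """Simulate J net incomes conditional on gross income being x0.

    Assumption: Gaussian kernel; its normalizing constant cancels out
    of the selection probabilities.
    """
    rng = np.random.default_rng(rng)
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    k = np.exp(-0.5 * ((x0 - x) / h) ** 2)   # kernel weight of each x_i at x0
    p = w * k
    p = p / p.sum()   # undefined if x0 is so far from every x_i that all k are 0
    # Step 1: pick observations close to x0 with higher probability
    idx = rng.choice(len(y), size=J, replace=True, p=p)
    # Steps 2 and 3: add kernel noise to the selected net incomes
    return y[idx] + h * rng.standard_normal(J)
```

Repeating the call for each observed gross income would give the local samples from which a global index could be assembled.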
Because they follow an estimated density function that is on average smoother than the true one, the simulated samples generated by the above algorithms will have a variance that is generally larger than both the variance observed in the sample and the true population variance. Let for instance the sample variance of the $y_i$ be denoted as $\hat\sigma_y^2$. In the univariate case, the variance of the simulated $y_j^*$ will equal $\hat\sigma_y^2 + h^2\sigma_K^2$, where $\sigma_K^2 = \int u^2 K(u)\,du$ is the variance of the kernel. This can be a problem if, as is the case for the measurement of indices of classical HI, the quantity of interest is intimately linked to the dispersion of income. There may also be a wish to constrain the simulated samples of net incomes to have precisely the same sample mean, $\bar y$, as the original sample. Constraining the simulated samples to have the same mean and variance as the original sample can be done by translating and rescaling the simulated samples. This involves replacing Step 3 above by
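Step 3' Set $y_j^* = \bar y + \left(y_i - \bar y + h\varepsilon\right) / \left(1 + h^2\sigma_K^2/\hat\sigma_y^2\right)^{1/2}$
in the univariate case. For the bivariate case, we also use Step 3', but replace $\bar y$ by $\bar y(x)$ and $\hat\sigma_y^2$ by $\hat\sigma_y^2(x)$, which can be respectively computed as:
$$\bar y(x) = \frac{\sum_{i=1}^n w_i K\left((x - x_i)/h\right) y_i}{\sum_{i=1}^n w_i K\left((x - x_i)/h\right)} \tag{15.16}$$
and
$$\hat\sigma_y^2(x) = \frac{\sum_{i=1}^n w_i K\left((x - x_i)/h\right) \left(y_i - \bar y(x)\right)^2}{\sum_{i=1}^n w_i K\left((x - x_i)/h\right)}. \tag{15.17}$$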
Equation (15.16) is in fact an example of a kernel regression of y on x, a procedure to which we now turn.
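The translation and rescaling of Step 3' can be sketched for the univariate case as follows. The sketch assumes a Gaussian kernel (so that the kernel variance equals 1) and uses weighted sample moments; the function name is hypothetical.

```python
import numpy as np

def simulate_variance_corrected(y, w, h, J, rng=None):
    """Smoothed resampling that preserves the sample mean and variance.

    Assumption: standard Gaussian kernel, whose variance sigma_K^2 is 1.
    """
    rng = np.random.default_rng(rng)
    y = np.asarray(y, dtype=float)
    p = np.asarray(w, dtype=float)
    p = p / p.sum()
    ybar = np.sum(p * y)                      # weighted sample mean
    var_y = np.sum(p * (y - ybar) ** 2)       # weighted sample variance
    sigma_k2 = 1.0
    idx = rng.choice(len(y), size=J, replace=True, p=p)
    eps = rng.standard_normal(J)
    # Shrink deviations from the mean so that the added kernel noise
    # does not inflate the variance of the simulated values.
    shrink = np.sqrt(1.0 + h ** 2 * sigma_k2 / var_y)
    return ybar + (y[idx] - ybar + h * eps) / shrink
```

The preserved moments hold in expectation; any finite simulated sample will match them only approximately.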
The estimation of an expected relationship between variables is the second most important sphere of recent applications of kernel estimation techniques. Nonparametric regressions offer several useful applications in distributive analysis. An example of such an application is the estimation of the relationship between expenditures and calorie intake. Regressing expenditure nonparametrically on calorie intake does not impose a fixed functional relationship between those two variables along the entire range of calorie intake. On the contrary, it allows a fair amount of flexibility by estimating the link between the two variables through a local weighting procedure. The local weighting procedure essentially considers the expenditures of those individuals with a calorie intake in the "region" of the specified calorie intake. It weights those values with weights that decrease rapidly with the distance from the specified calorie intake. Hence, those with calorie intake far from the specified level will contribute little to the estimation of the expenditure needed to attain that level. The results using this method are thus less affected by the presence of "outliers" in the distribution of incomes, and less prone to biases stemming from an incorrect specification of the link between spending and calorie intake.
Basically, then, one is interested in estimating the predicted response, $m(x)$, of a variable $y$ at a given value of a (possibly multivariate) variable $x$, that is,
$$m(x) = E[y \mid x]. \tag{15.18}$$
Alternatively, if the joint density $f(x, y)$ exists and if $f(x) > 0$, $m(x)$ can also be defined as:
$$m(x) = \frac{\int y\, f(x, y)\, dy}{f(x)}. \tag{15.19}$$
The difficulty in estimating the function m(x) is that we typically do not observe in a sample a response of y at that particular value of x. Furthermore, even if we do, there are rarely other observations with exactly the same value of x that will allow us to compute reliably the expected response in which we are interested.
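Let then $\{(x_i, y_i)\}_{i=1}^n$ be a sample of $n$ observed realizations jointly of $x$ and $y$. The response information that is provided by the sample can be expressed as:
$$y_i = m(x_i) + \varepsilon_i, \quad i = 1, \ldots, n. \tag{15.20}$$
To estimate $m(x)$, kernel regression techniques use a local averaging procedure that involves weights $K(u)$ that are analogous to those used in Section 15.1 for density estimation. Recalling (15.5) and (15.19), this leads to the following Nadaraya-Watson nonparametric estimator of $m(x)$⁴:
$$\hat m(x) = \frac{\sum_{i=1}^n w_i K\left((x - x_i)/h\right) y_i}{\sum_{i=1}^n w_i K\left((x - x_i)/h\right)}. \tag{15.21}$$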
To reduce the bias of using neighboring $y_i$'s, the kernel weights typically decrease with the distance between $x$ and $x_i$. They also depend on the window width $h$.
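As an illustration, a weighted Nadaraya-Watson smoother can be written as a ratio of two kernel-weighted sums. The sketch below assumes a Gaussian kernel and a scalar $x$; the function name `nadaraya_watson` is hypothetical and this is not DAD's own code.

```python
import numpy as np

def nadaraya_watson(x0, x, y, w, h):
    """Kernel regression estimate of E[y | x = x0] with sampling weights w.

    Assumption: Gaussian kernel (its constant cancels in the ratio).
    x0 may be a scalar or an array of evaluation points.
    """
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    u = (x0[:, None] - x[None, :]) / h        # scaled distances, one row per x0
    k = np.exp(-0.5 * u ** 2)                 # weights fall with distance
    num = (k * (w * y)).sum(axis=1)           # kernel-weighted sum of responses
    den = (k * w).sum(axis=1)                 # kernel-weighted sum of weights
    return num / den
```

With a very large window width the estimate tends to the weighted sample mean of $y$, while a very small one tracks the nearest observations.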
As in the case of the kernel density estimators, the kernel smoother $\hat m(x)$ can be shown to be consistent under relatively weak conditions, including that $m(x)$ and $f(x)$ are both continuous functions of $x$, and that $h \to 0$ and $nh \to \infty$ as $n \to \infty$ (see for instance Härdle 1990, Proposition 3.1.1). Again, the variance of $\hat m(x)$ alone does not fully capture the convergence of $\hat m(x)$ to $m(x)$ since we must also take into account the bias of $\hat m(x)$, which comes from the smoothing of the $y_i$ in (15.21). Under suitable regularity conditions, including that $h \sim n^{-0.2}$, the asymptotic distribution of the kernel estimator can be shown to be normal, with its center shifted by its asymptotic bias (see Härdle (1990), Theorem 4.2.1, for a demonstration). This asymptotic bias is a function of the form of the kernel $K(u)$ and of the derivatives of $m(x)$ and $f(x)$. It is given by:
$$\operatorname{bias}\left(\hat m(x)\right) \cong h^2 \sigma_K^2 \left(0.5\, m^{(2)}(x) + m^{(1)}(x)\, f^{(1)}(x) / f(x)\right).$$
This asymptotic bias can be estimated consistently using estimates of $m^{(2)}(x)$, $m^{(1)}(x)$, $f^{(1)}(x)$ and $f(x)$. Such an estimation, however, complicates significantly the computation of the sampling distribution of $\hat m(x)$, and it can be avoided if we can expect (or can make) the bias to be small compared to the variance. This will be the case if $m(x)$ is relatively constant, or if we make $h$ fall just a bit faster than its optimal speed of $n^{-0.2}$ (again, see the discussion of this in Härdle (1990), pp. 100-102).
4 DAD: Distribution | Non-Parametric Regression.
The variance of $\hat m(x)$ is given by:
$$\operatorname{var}\left(\hat m(x)\right) \cong (nh)^{-1}\, \frac{\sigma_y^2(x)}{f(x)} \int K(u)^2\, du.$$
The conditional variance $\sigma_y^2(x)$ can be estimated consistently as in (15.17). As in the case of kernel density estimation, note again that the smoothing process makes the kernel estimator converge at the rate $n^{-0.4}$ instead of the usual, slightly faster, parametric rate of $n^{-0.5}$.
This chapter draws significantly from Silverman (1986) and Härdle (1990), to which readers are referred for more details and in-depth analysis.
There exist in the population of interest a number of statistical units. For simplicity, we can think of these units as households or individuals. From an ethical perspective, it is usually preferable to consider individuals as statistical units of interest since it is in the welfare of individuals that we are ultimately interested, but for some purposes (such as the distribution of aggregate household wellbeing) households may also be appropriate statistical units.
These statistical units are those for which we would like to observe socioeconomic information such as their household composition, labor activity, income or consumption. Since it is usually too costly to gather information on all of the statistical units of a large population, one would typically be constrained to obtain information on only a sample of such units. Distributive analysis is therefore usually done using survey data.
Since surveys are not censuses, we must take care to distinguish unobserved "true" population values from observed sample values. Sample differences across surveys are indeed due both to true population differences and to sampling variability. Population values are generally not observed (otherwise, we would not need surveys). Sample values as such are rarely of interest: they would be of interest in themselves only if the statistical units that appeared by chance in a sample were also precisely those which were of ethical interest. This is not usually the case. Hence, sample values matter inasmuch as they can help infer true population values. The statistical process by which such inference is performed is called statistical inference. The sampling process should thus ideally be such that it can be used to make some statistically sensible distributive analysis at the level of the population, and not solely for the samples drawn.
Sampling errors thus arise because distributive estimates are typically made on the basis of only some of the statistical units of interest in a population. The fact that we have no information on some of the population statistical units makes us infer with sampling error the population value of the distributive indicators in which we are interested. The error made when relying solely on the information content of one sample depends on the statistical units present in that sample. The drawing of other samples would generate different sampling errors. Because samples are drawn randomly, the sampling errors that arise from the use of these samples are also random.
Since the true population values are unknown, the sampling error associated with the use of a given sample is thus also unknown. Statistical theory does, however, allow one to estimate the distribution of sampling errors from which actual (but unobserved) sampling errors arise. This nevertheless requires samples to be probabilistic, viz, that there be a known probability distribution associated to the distribution of statistical units in a sample. This also strictly means that there is absence of unquantifiable and subjective criteria in the choice of units. If this were not so, it would not be possible to assess reliably the sampling distribution of the estimators.
To draw a sample, a sampling base is used. A sampling base is made of all the sampling units (SU) from which a sample can be drawn. The base of sampling units (e.g., the census of all households within a country) is usually different from the entire population of statistical units (e.g., the population of individuals). There are several reasons for this, an important one being that it is generally cost effective to seek information only within a limited number of clusters of statistical units, grouped geographically or socioeconomically. This also facilitates the collection of cluster-level (e.g., village-level) information.
A process of simple random sampling draws sample observations randomly and independently from a base of sampling units, each with an equal probability of selection. Simple random sampling is rarely used in practice to generate household surveys. Instead, a population of interest (a country, say) is often first divided into geographical or administrative zones and areas, called strata. The first stage of random selection takes place from within a list of primary sampling units (denoted as PSU's) built for each stratum. Within each stratum, a number of PSU's are then randomly selected. PSU's are often departments, villages, etc. This random selection of PSU's provides "clusters" of information.
Since the cost of surveying all statistical units in each of these clusters may be prohibitive, it may be necessary to proceed to further stages of random selection within each selected PSU. For instance, within each department, a number of villages may be randomly selected, and within every selected village, a number of households may also be randomly selected. The final stage of random selection is done at the level of the last sampling units (LSU's). These LSU's are often households. Each selected LSU can then provide information on all individuals found within that LSU. These individuals are usually not selected: information on all of them appears in the sample. They therefore do not represent LSU's in statistical terminology.
Sampling weights (also called inverse probability, expansion or inflation factors) are the inverse of the sampling probabilities, viz, of the probabilities of a sampling unit appearing in the sample. These sampling weights are SU-specific. The sum of these weights is an estimator of the size of the population of SU's.
Samples are sometimes "self-weighted". Each sampling unit then has the same chance of being included in the survey. This arises, for instance, when the number of clusters selected in each stratum is proportional to the size of each stratum, when the clusters are randomly selected with probability proportional to their size, and when an identical number of households (or LSU's) across clusters is then selected with equal probability within each cluster.
It is, however, common for the inclusion probability to differ across households. One reason comes simply from the complexity of sample designs, which makes differential sampling weights occur frequently. Another reason is that the costs of surveying SU's vary, which makes it more cost effective to survey some households (e.g., urban ones) than others. Sampling precision can also be enhanced with differential probabilities of household inclusion. The idea here is to survey with greater probability those households who contribute more to the phenomenon of interest. It leads to a sampling process usually called sampling with "probability proportional to size".
Assume for instance that we are interested in estimating the value of a distribution-sensitive poverty index. The most important contributors to that index are obviously the poor households, and more precisely the poorest among them. An a priori suspicion might be that such poorest households are proportionately more likely to be found in some areas than in others. Making inclusion probabilities larger for households in these more deprived areas will then enhance the sampling precision of the estimator of the distribution-sensitive poverty index since it will gather data that are more statistically informative.
A reverse sample-design argument would apply for a survey intended to estimate total income in a population. The most important contributors to total income are the richest households, and it would thus be sensible to sample them with a greater probability. Yet one more consequence of the principle of "probability proportional to size" is the desirability of sampling with greater probability those households of larger sizes. Distributive analysis is normally concerned with the distribution of individual wellbeing. Ceteris paribus, larger-size households contribute more information towards such assessment, and should therefore be sampled with a greater probability (roughly speaking, with a probability proportional to their size).
Omitting sampling weights in distributive analysis will systematically bias both the estimators of the values of indices and points on curves as well as the estimation of the sampling variance of these estimators¹. Including such weights will usually make the analysis free of asymptotic biases. To see this, we follow Deaton (1998), p. 45, and let $Y$ be the population total of the $x$'s, with a population of size $N$. An estimator of that population total is then given by
$$\hat Y = \sum_{i=1}^N t_i w_i x_i \tag{16.1}$$
where $t_i$ is the number of times unit $i$ appears in a random sample of size $n$ and where $w_i$ is the sampling weight. Let $\pi_i$ be the probability that unit $i$ is selected each time an observation is drawn from the population. Households with a low value of $\pi_i$ will have a low probability of being selected in the survey, relative to others with a higher $\pi_i$. Then, $n\pi_i$ is the expected number of times unit $i$ will appear in the sample, or, for large $n$, it is roughly speaking the probability of being in the sample. Hence, with $w_i = (n\pi_i)^{-1}$,
$$E[\hat Y] = \sum_{i=1}^N E[t_i]\, w_i x_i = \sum_{i=1}^N n\pi_i\, (n\pi_i)^{-1} x_i = Y \tag{16.2}$$
and $\hat Y$ is therefore an unbiased estimator of $Y$. An analogous argument applies to show that $\hat N = \sum_{i=1}^N t_i w_i$ is an unbiased estimator of population size $N$.
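The unbiasedness argument can be checked numerically. The sketch below is purely illustrative: it builds an artificial lognormal population, makes richer units more likely to be drawn (selection probabilities proportional to $1 + x_i$, an arbitrary choice), and compares the weighted estimator of the total with a naive unweighted one. All names and parameter values are assumptions made for the illustration.

```python
import numpy as np

def weighted_total(x_sample, w_sample):
    """Estimator of the population total: sum of weight times value."""
    return float(np.sum(w_sample * x_sample))

def run_experiment(N=1000, n=200, reps=2000, seed=0):
    """Return (true total, mean weighted estimate, mean unweighted estimate)."""
    rng = np.random.default_rng(seed)
    x_pop = rng.lognormal(mean=0.0, sigma=1.0, size=N)   # artificial incomes
    pi = (1.0 + x_pop) / np.sum(1.0 + x_pop)             # per-draw selection probs
    weighted, unweighted = [], []
    for _ in range(reps):
        idx = rng.choice(N, size=n, replace=True, p=pi)  # unequal-probability draws
        w = 1.0 / (n * pi[idx])                          # weights 1 / (n * pi_i)
        weighted.append(weighted_total(x_pop[idx], w))
        unweighted.append(N * x_pop[idx].mean())         # ignores the weights
    return float(x_pop.sum()), float(np.mean(weighted)), float(np.mean(unweighted))
```

Averaged over many samples, the weighted estimate should sit close to the true total, while the unweighted one overstates it because rich units are over-represented in each sample.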
The sampling base is usually stratified into a number of strata. The basic advantage of stratification is to use prior information on the distribution of the population, and to "partition" it in parts that are thought to differ significantly from each other. Sampling then draws information systematically from each of those parts of the population. With stratification, no part of the sampling base therefore goes totally unrepresented in the final sample.
To be more specific, a variable of interest, such as household per capita income, often tends to be less variable within some stratum than across an entire population. This is because households within the same stratum typically share to a greater extent than within the entire population some socioeconomic characteristics (such as geographical locations, climatic conditions, and demographic characteristics) that are determinants of the incomes of these households. Stratification thus helps generate systematic sample information from a diversity of "socioeconomic areas".
1 DAD: Poverty | FGT Index.
Because information from a "broader" spectrum of the population leads on average to more precise estimates, stratification generally decreases the sampling variance of estimators. For instance, suppose at the extreme that household income is the same for all households in a given stratum, and this, for each and every stratum. In this case, supposing also that the population size of each stratum is known in advance, it would be sufficient to draw only one household from each stratum to know exactly the distribution of income in the population.
Multistage sampling implies that SU's end up in a sample only subsequently to a process of multistage selection. Groups (or "clusters") of SU's are first randomly selected within a population (which may be stratified). This is followed by further sampling within the selected groups, and followed by yet another process of random selection within the subgroups just selected.
The first stage of random selection is done at the level of primary sampling units (PSU). An important condition would seem to be that first-stage sampling be random and with replacement for the selection of a PSU to be done independently from that of another. There are many cases, however, in which this condition is not met.
1 First-stage sampling is typically made without replacement.
This will not matter in practice for the estimation of the sampling variance if there is multistage sampling, that is, if there is an additional stage of sampling within each selected PSU. The intuitive reason is that selecting a PSU only reveals random and incomplete information on the population of statistical units within that PSU, since not all of these statistical units appear in the sample when their PSU is selected. Selecting that same PSU once more (in a process of first-stage sampling with replacement) does therefore reveal additional information, information different from that provided by the first-time selection of that PSU. This extra information is roughly of equal value to that which would have been revealed if a process of sampling without replacement had forced the selection of a different PSU.
Hence, in the case of multistage sampling, first-stage sampling without replacement does not extract significantly more information than first-stage sampling with replacement. It does not therefore practically lead to less variable estimators than a process of first-stage sampling with replacement.
If, however, there is no further sampling after the initial selection of PSU's, then a finite population correction (FPC) factor should be used in the computation of the sampling variance. This would generate a better estimate of the true sampling variance. If FPC factors are not used, then the sampling variance of estimators will tend to be overestimated. This means that it will be more difficult to establish statistically significant differences across distributive estimates, making the distributive analysis more conservative and less informative than it could be.
2 Sampling is often systematic.
Systematic sampling can be done in various ways. For instance, a complete list of $N$ sampling units is gathered. Letting $n$ be the number of sampling units that are to be drawn, a "step" $s$ is defined as $s = N/n$. A first sampling unit is randomly chosen within the first $s$ units of the sampling list. Let the rank of that first unit be $k$, with $1 \le k \le s$. The $n - 1$ subsequent units with ranks $k + s, k + 2s, k + 3s, \ldots, k + (n - 1)s$ then complete the sample.
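The selection rule just described can be sketched as follows. The function name is hypothetical, and the sketch assumes for simplicity that $N$ is an exact multiple of $n$ so that the step is an integer.

```python
import numpy as np

def systematic_sample(N, n, rng=None):
    """Return the 1-based ranks of n units drawn systematically from a list of N.

    Assumption: N is a multiple of n, so that the step s = N / n is an integer.
    """
    if N % n != 0:
        raise ValueError("this sketch assumes that N is a multiple of n")
    rng = np.random.default_rng(rng)
    s = N // n                                # the step
    k = int(rng.integers(1, s + 1))           # random start among the first s units
    return [k + t * s for t in range(n)]      # ranks k, k + s, ..., k + (n - 1)s
```

Only one random number is drawn: once the starting rank is known, the whole sample is determined by the ordering of the list.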
If the order in which the sampling units appear in the sampling list is random, then such systematic sampling is equivalent to pure random sampling. If, however, this is not the case, then the effect of such systematic sampling on the sampling variance of the subsequent distributive estimators depends on how the sampling units were ordered in the sampling list in the first place.
(a) For instance, a "cyclical" ordering makes sampling units appear in cycles. "Similar" sampling units then show up in the sampling list at roughly fixed intervals. Suppose for illustrative purposes that the size of these intervals is the same as s. Then, systematic sampling will lead to a gathering of information on similar units (e.g., with similar incomes), thus reducing the statistical information that is extracted from the sample. This will reduce the sampling precision of estimators, and increase their sampling variance.
(b) A cyclical ordering of sampling units suggests that there is more sampling-unit heterogeneity around a given sampling unit than across the whole sampling base (since information around sampling units is simply cyclically repeated across the sampling base). A more frequent phenomenon arises when adjacent sampling units show less heterogeneity than that shown by the entire sampling base. A typical occurrence of this is when sampling units are ordered geographically in a sampling list. Households living close to each other appear close to each other in the list. Villages far away from each other are also far away in the sampling list. Since geographic proximity is often associated with socioeconomic resemblance, the farther apart sampling units are in the list, the more likely they are also to differ in socioeconomic characteristics.
Systematic sampling will then force units from across the entire sampling list to appear in the sample. Representation from implicit strata will thus be compelled into the sample. This will lead to a sampling feature usually called implicit stratification. Pure random sampling from the sampling list will not force such a systematic extraction of information, and will therefore lead to more variable estimators.
How much implicit stratification reduces sampling variability depends on the degree of between-stratum heterogeneity that stratification allows one to extract, just as for explicit stratification. The larger the heterogeneity of units far from each other, the larger the fall in the sampling variability induced by the systematic sampling's implicit stratification.
One way to account for and to detect the impact of implicit stratification in the estimation of sampling variances is to group pairs of adjacent sampling units into implicit strata. Assume again that n sampling units are selected systematically from a sampling list. Then, create n/2 implicit strata and compute sampling variances as if these were explicit strata. If these pairs did not really constitute implicit strata (because, say, the ordering in the sampling list had in fact been established randomly), then this procedure will not affect much the resulting estimate of the sampling variance. But if systematic sampling did lead to implicit stratification, then the pairing of adjacent sampling units will reduce the estimate of the sampling variance — since the variability within each implicit stratum will be found to be systematically lower than the variability across all selected sampling units.
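The pairing procedure can be illustrated for the simplest case, the sampling variance of an unweighted sample mean. This sketch assumes equal weights, no FPC factor and an even number of selected units, listed in the order of the sampling list; both function names are hypothetical.

```python
import numpy as np

def var_mean_srs(y):
    """Estimated variance of the sample mean, ignoring any stratification."""
    y = np.asarray(y, dtype=float)
    return float(y.var(ddof=1) / len(y))

def var_mean_paired_strata(y):
    """Estimated variance of the sample mean, treating adjacent pairs of
    selected units as n/2 implicit strata of two units each."""
    y = np.asarray(y, dtype=float)
    if len(y) % 2 != 0:
        raise ValueError("this sketch assumes an even number of units")
    pairs = y.reshape(-1, 2)                  # adjacent units form a stratum
    H = len(pairs)                            # number of implicit strata
    within = pairs.var(axis=1, ddof=1)        # within-stratum variance of each pair
    # Stratified mean with equal stratum shares 1/H and two units per stratum
    return float(np.sum(within / 2.0) / H ** 2)
```

If the list ordering carries information, the paired estimate falls well below the unstratified one; if the ordering is random, the two should be of similar size.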
Generally, variables of interest (such as incomes) vary less within a cluster than between clusters. Hence, ceteris paribus, multistage selection reduces the "diversity" of information generated compared to SRS and leads to a less informative coverage of the population. The impact of clustering sample observations is therefore to tend to decrease the precision of estimators, and thus to increase their sampling variance. Ceteris paribus, the lower the within-cluster variability of a variable of interest, the smaller the gain of information that there is in sampling further within the same clusters.
To see this, suppose the extreme case in which household income happens to be the same for all households in a cluster, and this, for all clusters. In such cases, it is clearly wasteful to adopt multistage sampling: it would be sufficient to draw one household from each cluster in order to know the distribution of income within that cluster. More information would be gained from sampling from other clusters.
There are two modelling approaches to thinking about how data were initially generated. The first one, which is also the more traditional in the sampling design literature, is the finite population approach. The second approach is the superpopulation one: the actual population is a sample drawn from all possible populations, the infinite superpopulation. This second approach sometimes presents analytical advantages, and it is therefore also regularly used in econometrics.
To illustrate the impact of stratification and clustering on sampling variability, consider therefore the following "superpopulation model", based on Deaton (1998), p. 56. In it, the income $x_{hij}$ of a household $j$ from a cluster $i$ of a stratum $h$ can be modelled as:
$$x_{hij} = \mu + \alpha_h + \beta_{hi} + \varepsilon_{hij}. \tag{16.3}$$
For simplicity, assume that the xhij are drawn from the same number n of clusters in each of the L strata, and that the same number of LSU (or "households") m is selected in each of the clusters. The indices hij then stand for:
h = 1,...,L: stratum h
i = 1,...,n: cluster i (in stratum h)
j = 1,...,m: household j (in cluster i of stratum h).
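For simplicity, also assume that $\alpha_h$ is distributed with mean 0 and variance $\sigma_\alpha^2$, that $\beta_{hi}$ is distributed with mean 0 and variance $\sigma_\beta^2$, and that $\varepsilon_{hij}$ is distributed with mean 0 and variance $\sigma_\varepsilon^2$. Assume moreover that these three random terms are distributed independently from each other.
Say that we wish to estimate mean income $\mu$. The estimator, $\hat\mu$, is given by
$$\hat\mu = (Lnm)^{-1} \sum_{h=1}^L \sum_{i=1}^n \sum_{j=1}^m x_{hij}. \tag{16.4}$$
Let
$$\hat\mu_h = (nm)^{-1} \sum_{i=1}^n \sum_{j=1}^m x_{hij} \tag{16.5}$$
be the estimator of the mean $\mu_h$ of stratum $h$. Clearly, $E[\hat\mu_h] = \mu_h$ and $E[\hat\mu] = \mu$ since by (16.4) and (16.5)
$$\hat\mu = L^{-1} \sum_{h=1}^L \hat\mu_h \tag{16.6}$$
and
$$\mu = L^{-1} \sum_{h=1}^L \mu_h. \tag{16.7}$$
Because of the independence of sampling across strata, we also have that
$$\operatorname{var}(\hat\mu) = L^{-2} \sum_{h=1}^L \operatorname{var}(\hat\mu_h). \tag{16.8}$$
The sampling variability of $\hat\mu$ is thus $L^{-1}$ times the simple average of the sampling variances of the $L$ strata's $\hat\mu_h$.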
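Stratification can in fact be thought of as an extreme case of clustering, with the number of selected clusters corresponding to the number of population clusters, and with sampling being done without replacement to ensure that all population clusters will appear in the sample. Suppose instead that one were to select $L$ strata randomly and with replacement, to make it possible that not all of the strata will be selected. This is in a sense what happens when stratification is dropped and clustering is introduced. Using (16.4) and (16.5), we then have that
$$\hat\mu = L^{-1} \sum_{h=1}^L t_h \hat\mu_h \tag{16.9}$$
where $t_h$ is a random variable showing the number of times stratum $h$ was selected. Then, recalling that $\mu_h = \mu + \alpha_h$ and linearizing (16.9), we have approximately that
$$\hat\mu \cong L^{-1} \sum_{h=1}^L \left[(\hat\mu_h - \mu_h) + t_h \mu_h\right] \tag{16.10}$$
and thus that
$$E[\hat\mu] \cong L^{-1} \sum_{h=1}^L \mu_h = \mu \tag{16.11}$$
since $E[\hat\mu_h] = \mu_h$ and $E[t_h] = 1$. Assuming independence between $\hat\mu_h$ and $t_h$ and between the $\hat\mu_h$, we have that
$$\operatorname{var}(\hat\mu) \cong L^{-2} \sum_{h=1}^L \operatorname{var}(\hat\mu_h) + L^{-2} \operatorname{var}\left(\sum_{h=1}^L t_h \mu_h\right). \tag{16.12}$$
Since $t_h$ follows a multinomial distribution, with $\operatorname{var}(t_h) = (L - 1)/L$ and $\operatorname{cov}(t_h, t_i) = -1/L$, we find that
$$\operatorname{var}\left(\sum_{h=1}^L t_h \mu_h\right) = \sum_{h=1}^L \mu_h^2 \operatorname{var}(t_h) + \sum_{h=1}^L \sum_{i \neq h} \mu_h \mu_i \operatorname{cov}(t_h, t_i) \tag{16.13}$$
$$= \frac{L - 1}{L} \sum_{h=1}^L \mu_h^2 - \frac{1}{L} \sum_{h=1}^L \sum_{i \neq h} \mu_h \mu_i \tag{16.14}$$
$$= \sum_{h=1}^L (\mu_h - \mu)^2 = \sum_{h=1}^L \alpha_h^2. \tag{16.15}$$
Hence, using (16.12) and (16.15), we obtain
$$\operatorname{var}(\hat\mu) \cong L^{-2} \sum_{h=1}^L \operatorname{var}(\hat\mu_h) + L^{-2} \sum_{h=1}^L \alpha_h^2. \tag{16.16}$$
The last term in (16.16) is the effect upon sampling variability of removing stratification. The larger this term, the greater the fall in sampling variability that originates from stratification.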
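Let us now investigate the effect of clustering on the sampling variance, that is, on $\operatorname{var}(\hat\mu_h)$. We find:
$$\begin{aligned} \operatorname{var}(\hat\mu_h) &= \operatorname{var}\left((nm)^{-1} \sum_{i=1}^n \sum_{j=1}^m x_{hij}\right) \\ &= \operatorname{var}\left(n^{-1} \sum_{i=1}^n \beta_{hi} + (nm)^{-1} \sum_{i=1}^n \sum_{j=1}^m \varepsilon_{hij}\right) \\ &= n^{-1} \sigma_\beta^2 + (nm)^{-1} \sigma_\varepsilon^2. \end{aligned} \tag{16.17}$$
The first line of (16.17) follows by the definition of $\hat\mu_h$, and the second line follows from (16.3); note that $\alpha_h$ is fixed for all of the $x_{hij}$ in the same stratum $h$. The last line of (16.17) is obtained from the sampling independence between $\beta_{hi}$ and $\varepsilon_{hij}$.
Hence, for a given per-stratum number of observations $nm$, it is better to have a large $n$ to reduce sampling variability, namely, it is better to draw observations from a large number of clusters. The larger the cross-cluster variability $\sigma_\beta^2$, the more important it is to have a large number of clusters in order to keep $\operatorname{var}(\hat\mu_h)$ low. Ceteris paribus, for a given sample size and for a given $\sigma_\beta^2 + \sigma_\varepsilon^2$, the sampling variance of distributive estimators is smaller the smaller the between-cluster heterogeneity, $\sigma_\beta^2$, but the larger the within-cluster heterogeneity, $\sigma_\varepsilon^2$.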
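The trade-off between the number of clusters and the number of households per cluster can be made concrete with a small sketch. It assumes the variance expression $\sigma_\beta^2/n + \sigma_\varepsilon^2/(nm)$ implied by the model's independence assumptions, and checks it by simulating normally distributed cluster and household effects (normality is an assumption of the illustration only). Function names are hypothetical.

```python
import numpy as np

def var_stratum_mean(sigma_beta2, sigma_eps2, n, m):
    """Sampling variance of a stratum mean with n clusters of m households."""
    return sigma_beta2 / n + sigma_eps2 / (n * m)

def simulate_stratum_means(sigma_beta2, sigma_eps2, n, m, reps, rng=None):
    """Simulate `reps` stratum means; mu and alpha_h are fixed and set to 0."""
    rng = np.random.default_rng(rng)
    beta = rng.normal(0.0, np.sqrt(sigma_beta2), size=(reps, n, 1))   # cluster effects
    eps = rng.normal(0.0, np.sqrt(sigma_eps2), size=(reps, n, m))     # household effects
    return (beta + eps).mean(axis=(1, 2))
```

For instance, with 200 households per stratum, `var_stratum_mean(1.0, 4.0, 10, 20)` is about four times as large as `var_stratum_mean(1.0, 4.0, 100, 2)`: spreading the same number of households over more clusters reduces the variance.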
Sampling without replacement imposes that all of the selected sampling units are different. It therefore extracts on average more information from the sampling base than sampling with replacement, and ensures that the samples drawn are on average closer to the population of sampling units. Sampling without replacement therefore increases the precision of sample estimators. To account for this increase in sampling precision, a FPC factor can be used, although it complicates slightly the estimation of the variance of the relevant estimators.
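Assume simple random sampling of $n$ sampling units from a population of $N$ sampling units. Thus, we have that $w_i = N/n$ for all of the $n$ sample observations. To illustrate the derivation of an FPC factor in this simplified case, we follow Cochran (1977) and Deaton (1998), pp. 42-44. An estimator of the population total $Y$ of the $x$'s is given by
$$\hat Y = \frac{N}{n} \sum_{i=1}^N t_i x_i \tag{16.18}$$
where the random variable $t_i$ indicates whether (and how many times) the population unit $i$ was included in the sample. Taking the variance of (16.18), we find:
$$\operatorname{var}(\hat Y) = \left(\frac{N}{n}\right)^2 \left[\sum_{i=1}^N x_i^2 \operatorname{var}(t_i) + \sum_{i=1}^N \sum_{j \neq i} x_i x_j \operatorname{cov}(t_i, t_j)\right]. \tag{16.19}$$
Using (16.18) and (16.19), the distinction between simple random sampling with and without replacement is analogous to the distinction between a binomial and a multinomial distribution for the $t_i$. With sampling without replacement, the probability that any one population unit appears in the final sample is equal to $n/N$, i.e., $E[t_i] = n/N$. Since $t_i$ then takes either a 0 or a 1 value, it thus follows a binomial distribution with parameter $n/N$. The variance of $t_i$ is then given by $(n/N)(1 - n/N)$. The covariance $\operatorname{cov}(t_i, t_j)$ can be found by noting that $E[t_i t_j] = P(t_i = t_j = 1) = (n/N) \cdot (n - 1)/(N - 1)$, and thus that
$$\operatorname{cov}(t_i, t_j) = \frac{n(n - 1)}{N(N - 1)} - \frac{n^2}{N^2} = -\frac{n(N - n)}{N^2 (N - 1)}. \tag{16.20}$$
Substituting $\operatorname{var}(t_i)$ and $\operatorname{cov}(t_i, t_j)$ into (16.19), and defining
$$S^2 = (N - 1)^{-1} \sum_{i=1}^N \left(x_i - Y/N\right)^2, \tag{16.21}$$
we find
$$\operatorname{var}(\hat Y) = N^2 (1 - f) \frac{S^2}{n} \tag{16.22}$$
where $1 - f = (N - n)/N$ is an FPC factor.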
Take now the case of simple random sampling with replacement. We can then express $t_i$ for any given population unit $i$ as a sum of $n$ independent draws $t_{ij}$, with $j = 1, \ldots, n$, each one $t_{ij}$ indicating whether observation $i$ was selected in draw $j$. Thus:
$$t_i = \sum_{j=1}^n t_{ij}. \tag{16.23}$$
Since for any draw $j$, $E[t_{ij}] = 1/N$, the expected value of $t_i$ is again $n/N$, but $t_i$ may now take values greater than 1. The draws $t_{ij}$ being independent, and each draw having a binomial distribution with parameter $1/N$, we have that
$$\operatorname{var}(t_i) = n \frac{1}{N} \left(1 - \frac{1}{N}\right) \tag{16.24}$$
which is the variance of a multinomial distribution with parameters $n$ and $1/N$. It can be checked that the covariance $\operatorname{cov}(t_i, t_j)$ is given by $-n/N^2$. Substituting $\operatorname{var}(t_i)$ and $\operatorname{cov}(t_i, t_j)$ into (16.19) again, we now find
$$\operatorname{var}(\hat Y) = N^2\, \frac{N - 1}{N}\, \frac{S^2}{n}. \tag{16.25}$$
This is larger than (16.22): the difference between the two results equals
$$N (n - 1) \frac{S^2}{n} \tag{16.26}$$
and depends on the magnitude of $n$ relative to $N$. The larger the value of $n$ relative to $N$, the greater the sampling precision gains that there are in sampling without replacement.
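To give a sense of magnitudes, the two variance expressions for the estimated total under simple random sampling can be coded directly. The sketch assumes the standard results $N^2(1-f)S^2/n$ without replacement and $N^2\,((N-1)/N)\,S^2/n$ with replacement, where $S^2$ uses the $N - 1$ divisor; the function names are hypothetical.

```python
import numpy as np

def var_total_without_replacement(x_pop, n):
    """Variance of the estimated total under SRS without replacement."""
    x_pop = np.asarray(x_pop, dtype=float)
    N = len(x_pop)
    S2 = x_pop.var(ddof=1)                    # S^2 with the N - 1 divisor
    fpc = (N - n) / N                         # finite population correction, 1 - f
    return float(N ** 2 * fpc * S2 / n)

def var_total_with_replacement(x_pop, n):
    """Variance of the estimated total under SRS with replacement."""
    x_pop = np.asarray(x_pop, dtype=float)
    N = len(x_pop)
    S2 = x_pop.var(ddof=1)
    return float(N ** 2 * ((N - 1) / N) * S2 / n)
```

For the toy population `[1, 2, 3, 6]` and `n = 2`, the first function returns about 18.67 and the second 28, which match the variances obtained by enumerating all possible samples by hand.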
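We follow once more the approach of Cochran (1977) and Deaton (1998), pp. 45-49. Suppose that we are again interested in estimating the variance of the estimator $\hat Y$ of a total $Y$, but for simplicity assume that sampling is done with replacement so that we can for now ignore FPC factors. $\hat Y$ is now defined as:
$$\hat Y = \sum_{i=1}^N t_i w_i x_i. \tag{16.27}$$
Taking its variance, we find
$$\operatorname{var}(\hat Y) = \sum_{i=1}^N w_i^2 x_i^2 \operatorname{var}(t_i) + \sum_{i=1}^N \sum_{j \neq i} w_i w_j x_i x_j \operatorname{cov}(t_i, t_j). \tag{16.28}$$
$t_i$ follows once more a multinomial distribution, but now with $\operatorname{var}(t_i) = n\pi_i(1 - \pi_i)$ and $\operatorname{cov}(t_i, t_j) = -n\pi_i \pi_j$. Substituting this into (16.28), and recalling that $w_i = (n\pi_i)^{-1}$, we find
$$\operatorname{var}(\hat Y) = n \sum_{i=1}^N \pi_i \left(w_i x_i - \frac{Y}{n}\right)^2. \tag{16.29}$$
To estimate (16.29), we can replace population values by sample values and thus use the estimator
$$\widehat{\operatorname{var}}(\hat Y) = \sum_{i=1}^n \left(w_i x_i - \frac{\hat Y}{n}\right)^2. \tag{16.30}$$
Denote as $y_i = w_i x_i$, $i = 1, \ldots, n$, the $n$ sample values of $w_i x_i$, and let $\bar y = n^{-1} \sum_{i=1}^n y_i$. Then, (16.30) leads to
$$\widehat{\operatorname{var}}(\hat Y) = \frac{n}{n - 1} \sum_{i=1}^n (y_i - \bar y)^2 \tag{16.31}$$
with the difference that a familiar $n/(n - 1)$ small-sample correction factor has been introduced in (16.31) to correct for the small-sample bias in estimating the variance of the $y_i$. Incorporating weights in the estimation of sampling variances is thus relatively straightforward.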
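A minimal sketch of this weighted variance estimator is given below. It covers only the simple case discussed here (independent draws with replacement, no stratification, clustering or FPC factor) and is not the full design-based procedure used by DAD; the function name is hypothetical.

```python
import numpy as np

def estimate_total_and_variance(x, w):
    """Estimated population total and its estimated sampling variance.

    Assumption: observations are independent draws with replacement, and
    w holds the sampling weight attached to each sample observation.
    """
    y = np.asarray(w, dtype=float) * np.asarray(x, dtype=float)   # y_i = w_i * x_i
    n = len(y)
    total = float(y.sum())
    # Sum of squared deviations of the y_i, with the n / (n - 1) correction
    variance = float(n / (n - 1) * np.sum((y - y.mean()) ** 2))
    return total, variance
```

With equal weights $w_i = N/n$ the variance estimate reduces to $N^2 s^2/n$, where $s^2$ is the usual sample variance of the $x_i$.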
The above material calls to mind the importance for statistical offices of making available sampling design information. This includes providing
the sampling weights;
stratum and PSU (cluster) identifying variables;
information on the presence or not of systematic sampling (and thus of implicit stratification), including the relationship between the numbering of sampling units and the original ordering of these units in the sampling base;
the finite population correction factors, namely, the size of the sampling base, when appropriate.
Equipped with this information, distributive analysts can provide reliable estimates of the sampling precision of their estimators.²
E:18.9.1
We provide in this section a detailed account of the computation of sampling variances in DAD, taking full account of the sampling design. Let:
$h = 1, \dots, L$: the list of the strata (e.g., the geographical regions);
$i = 1, \dots, N_h$: the list of primary sampling units (PSU; e.g., villages) in stratum h;
$N_h$: the population number of PSU in stratum h;
$n_h$: the number of selected PSU in stratum h;
$M_{hi}$: the population number of last sampling units (LSU) (e.g., households) in PSU hi;
$m_{hi}$: the number of selected LSU in PSU hi (for instance, the number of households from village hi that appear in the sample);
$q_{hij}$: the number of observations in selected LSU hij (e.g., the number of household members in a household hij whose socio-economic information is recorded in the survey, with each household member providing 1 line of information in the data file);
² DAD: Edit|Set Sample Design.
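$w_{hij}$: the sampling weight of LSU hij;
$M = \sum_{h=1}^{L}\sum_{i=1}^{N_h} M_{hi}$: the population number of LSU (e.g., the number of households in the population);
$m = \sum_{h=1}^{L}\sum_{i=1}^{n_h} m_{hi}$: the number of selected LSU (e.g., the number of selected households that appear in the sample);
$X_{hijk}$: the value of the variable of interest (e.g., adult-equivalent income) for statistical unit hijk in the population;
$S_{hijk}$: the size of statistical unit hijk in the population (e.g., if the statistical unit is a household, then $S_{hijk}$ may be the number of persons in household hijk, or alternatively the number of adult equivalents);
$Y = \sum_{h}\sum_{i}\sum_{j}\sum_{k} S_{hijk}X_{hijk}$: the population total of interest;
$x_{hijk}$: the value of X (the variable of interest) that appears in the sample for sample observation hijk;
$s_{hijk}$: the size of selected sample observation hijk;
$\hat{Y} = \sum_{h=1}^{L}\sum_{i=1}^{n_h}\sum_{j=1}^{m_{hi}}\sum_{k=1}^{q_{hij}} w_{hij}\,s_{hijk}\,x_{hijk}$: the estimated population total of interest;
$\hat{M} = \sum_{h=1}^{L}\sum_{i=1}^{n_h}\sum_{j=1}^{m_{hi}} w_{hij}$: the estimated population number of LSU;
$y_{hij} = \sum_{k=1}^{q_{hij}} s_{hijk}\,x_{hijk}$: the relevant sum in LSU hij;
$y_{hi} = \sum_{j=1}^{m_{hi}} w_{hij}\,y_{hij}$: the relevant sum in PSU hi;
$\bar{y}_h = n_h^{-1}\sum_{i=1}^{n_h} y_{hi}$: the relevant mean in stratum h.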
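The sampling covariance of two totals, $\hat{Y}$ and $\hat{Z}$ ($\hat{Z}$ being defined similarly to $\hat{Y}$), is then estimated by
$$\widehat{\operatorname{cov}}(\hat{Y}, \hat{Z}) = \sum_{h=1}^{L} n_h\,(1 - f_h)\,S_h^{yz}, \qquad (16.32)$$
where
$$S_h^{yz} = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}\left(y_{hi} - \bar{y}_h\right)\left(z_{hi} - \bar{z}_h\right)$$
(note the similarity with (16.21) and (16.22)), and where $f_h$ is a function of a user-specified FPC factor, $fpc_h$, for stratum h, such that,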
if a fpch is not specified by the user, then fh = 0;
if fpch ≥ nh, then fh = nh/fpch;
if fpch ≤ 1, then fh = fpch.
Recall that setting $f_h \neq 0$ is useful only when the sampling design takes the form either of simple random sampling or of stratified random sampling with no subsequent subsampling within the selected PSUs. In both cases, sampling must have been done without replacement.
The variance of $\hat{Y}$ is obtained from (16.32) simply by replacing $\hat{Z}$ by $\hat{Y}$.
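The design-based computation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DAD's own code: it assumes the usual between-PSU form of the variance of a total, Σ_h (1 − f_h) n_h/(n_h − 1) Σ_i (y_hi − ȳ_h)², and it applies the three rules for f_h listed above. The function names and the record layout (stratum, PSU, weight, LSU-level sum) are hypothetical.

```python
from collections import defaultdict


def sampling_fraction(n_h, fpc_h):
    """Turn a user-specified FPC factor into f_h, following the rules above."""
    if fpc_h is None:          # no FPC factor specified
        return 0.0
    if fpc_h >= n_h:           # fpc_h read as the number of PSU in the stratum
        return n_h / fpc_h
    if fpc_h <= 1:             # fpc_h read directly as a sampling rate
        return fpc_h
    raise ValueError("fpc_h between 1 and n_h is not covered by the rules")


def design_based_variance(records, fpc=None):
    """Variance of an estimated total under stratified cluster sampling.

    records: iterable of (stratum, psu, weight, y), one entry per LSU,
             where y is the LSU-level sum of size times variable of interest.
    fpc:     optional dict mapping stratum -> FPC factor.
    """
    psu_sums = defaultdict(lambda: defaultdict(float))
    for stratum, psu, weight, y in records:
        psu_sums[stratum][psu] += weight * y      # weighted sum in each PSU

    variance = 0.0
    for stratum, sums in psu_sums.items():
        n_h = len(sums)
        if n_h < 2:
            raise ValueError("each stratum needs at least two sampled PSU")
        f_h = sampling_fraction(n_h, None if fpc is None else fpc.get(stratum))
        mean_h = sum(sums.values()) / n_h         # mean of PSU sums in stratum
        squares = sum((s - mean_h) ** 2 for s in sums.values())
        variance += (1 - f_h) * n_h / (n_h - 1) * squares
    return variance


# Hypothetical sample: two strata, with two and three PSU respectively.
records = [
    ("A", 1, 2.0, 10.0), ("A", 1, 2.0, 20.0), ("A", 2, 2.0, 40.0),
    ("B", 1, 1.0, 5.0), ("B", 2, 1.0, 15.0), ("B", 3, 1.0, 10.0),
]
variance_no_fpc = design_based_variance(records)
variance_fpc = design_based_variance(records, fpc={"A": 4, "B": 0.5})
```

Only the variation of PSU sums within each stratum enters the computation, which is why stratum and PSU identifiers are needed in addition to the sampling weights.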
An often-used indicator of the impact of sampling design on sampling variability is called the design effect, deff. The design effect is the ratio of the design-based estimate of the sampling variance over the estimate of the sampling variance assuming that we have obtained a simple random sample of m LSU without replacement. Denote this latter estimate as $\hat{V}_{SRS}$. Then,
$$\mathit{deff} = \frac{\widehat{\operatorname{var}}(\hat{Y})}{\hat{V}_{SRS}}.$$
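For simple random sampling, we would have that
$$\hat{Y} = \frac{M}{m}\sum_{h}\sum_{i}\sum_{j} y_{hij}$$
and, recalling (16.22), the sampling variance of $\hat{Y}$ would then equal
$$V_{SRS} = M^2\,(1 - f)\,\frac{\operatorname{var}(y)}{m},$$
where var(y) is the variance of the population $y_{hij}$, and where $f = m/M$ if an FPC factor is specified for the computation of $\widehat{\operatorname{var}}(\hat{Y})$ and f = 0 otherwise. $V_{SRS}$ can then be estimated as follows:
$$\hat{V}_{SRS} = (1 - f)\,\frac{\hat{M}^2}{m}\,\frac{1}{\hat{M} - 1}\sum_{h}\sum_{i}\sum_{j} w_{hij}\left(y_{hij} - \hat{Y}/\hat{M}\right)^2.$$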
Some of the above variables often take familiar forms and names:
$x_{hijk}$ can be thought of as an "individual-level" variable, such as height, health status, schooling, or own consumption. This variable is called the "variable of interest" in DAD. If $x_{hijk}$ is indeed individual-specific, then $s_{hijk}$ will not exceed 1 in most reasonable instances. Individual outcomes are, however, not always observed. Even if they are, we may sometimes believe that there is equal sharing in the household to which individuals belong. In those cases, $x_{hijk}$ will typically take the form of adult-equivalent income or some other household-specific measure of living standard.
$s_{hijk}$ gives the "size" of the sample observation hijk. This size may be purely demographic, such as the number of individuals in the unit whose living standard is captured by $x_{hijk}$. It may also be 1 even if hijk represents a household and if we are interested in a household count for distributive analysis. But $s_{hijk}$ may also be an ethical size, which depends on normative perceptions of how important the unit is in terms of some distributive analysis. Examples of such sizes include the number of adult equivalents in the unit (if, say, we wish to assign individuals an ethical weight that is proportional to their "needs"), the number of families, the number of adults, the number of workers, the number of children, the number of citizens, the number of voters, etc.
$q_{hij}$ is the number of sample observations or statistical units provided by the last sampling unit. This LSU may contain a grouping of households, of villages, etc. More commonly for the empirical analysis of poverty and equity, an LSU represents a household.
General references on estimation and inference taking into account survey design include Asselin (1984) and Cochran (1977). Applications to economic analysis are discussed and presented in Deaton (1998), Howes and Lanjouw (1998) (focussing on poverty analysis), and Zheng (2002) (with a specific focus on Lorenz curves). Alternative approaches to taking into account survey design can be found inter alia in Cowell (1989) (modelling sampling weights as jointly distributed with living standards), and in Biewen (2002b) and Schluter and Trede (2002a) (for dependence across members of the same sampling unit, households in their case). Kennickell and Woodburn (1999) illustrate the impact of a consistent estimation of survey weights in the US Survey of Consumer Finances for the analysis of the distribution of wealth.
Assessing statistically the extent of poverty and equity in a distribution, or checking for distributive differences, usually involves three steps. First, one formulates hypotheses of interest, such as that the poverty headcount is less than 20%, or that tax equity has increased over time, or that inequality is greater in one country than in another. Second, one computes distributive statistics, weighting observations by their sampling weights and (when appropriate) by a size variable. Third, one uses these statistics to test the hypotheses of interest. This last step can involve testing the hypotheses directly, or building confidence intervals within which the true population values of interest can be located with a given level of confidence. This third step may allow for the effects of survey design on the sampling distributions of distributive indices and test statistics, and may also involve performing numerical simulations of such sampling distributions, if the circumstances make it desirable to do so.
Under the null hypothesis that μ = μ0, and under some generally mild regularity conditions, all of the estimators $\hat{\mu}$ and associated test statistics considered in this book and programmed in DAD can be shown to be asymptotically normally distributed with mean μ0 and asymptotic sampling variance $\sigma^2_{\hat{\mu}}$. This can be simply stated as:
$$\hat{\mu} \sim N\left(\mu_0, \sigma^2_{\hat{\mu}}\right).$$
The parameter $\sigma^2_{\hat{\mu}}$ is unknown, but we can typically estimate it consistently by $\hat{\sigma}^2_{\hat{\mu}}$; this is indeed usually readily provided by DAD. Asymptotically, we can then also write that:
$$\hat{\mu} \sim N\left(\mu_0, \hat{\sigma}^2_{\hat{\mu}}\right),$$
which also implies that
$$\frac{\hat{\mu} - \mu_0}{\hat{\sigma}_{\hat{\mu}}} \sim N(0, 1),$$
a statistic that does not depend on unknown (or "nuisance") parameters, and that is therefore typically called "pivotal". Many of the results that follow rely implicitly on this result.
In the simplest cases, the estimators of interest can be expressed as a straightforward sum of variable values across observations. Take for instance the case of an estimator $\hat{\alpha}_1$ estimated using a sample of n observations of $y_{1,i}$:
$$\hat{\alpha}_1 = n^{-1}\sum_{i=1}^{n} y_{1,i}. \qquad (17.4)$$
This is of course just the sample mean of the $y_{1,i}$'s. As is well known, the asymptotic sampling distribution of $\hat{\alpha}_1$ is given by
$$\sqrt{n}\left(\hat{\alpha}_1 - \alpha_1\right) \sim N\left(0, \sigma^2_{y_1}\right),$$
where $\alpha_1$ and $\sigma^2_{y_1}$ are respectively the population mean and the population variance of $y_1$. That variance can be estimated consistently by the sample variance of the $y_{1,i}$'s.
Unfortunately, most of the distributive estimators do not take the simple form of (17.4). Instead, they often take the following general form:
$$\hat{\theta} = g\left(\hat{\alpha}_1, \hat{\alpha}_2, \dots, \hat{\alpha}_K\right),$$
where
each $\hat{\alpha}_k$ is expressible as a sum of the n observations of $y_{k,i}$: $\hat{\alpha}_k = n^{-1}\sum_{i=1}^{n} y_{k,i}$;
θ can be expressed as a continuous function g of the α's;
and $y_{k,i}$ is usually some k-specific transform of the income of observation i.
The sampling distribution of $\hat{\theta}$ will depend on the function g and on the joint sampling distribution of the estimators $\hat{\alpha}_k$, k = 1,..., K. This joint sampling distribution is usually easily estimated by considering the joint distribution of the $y_{k,i}$ (recall (17.4)).
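DAD then generally uses Rao (1973)'s linearization approach to derive the standard error of indices such as $\hat{\theta}$. Define $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_K)'$ and let G be the gradient of g with respect to the α's:
$$G = \left(\frac{\partial g}{\partial \alpha_1}, \frac{\partial g}{\partial \alpha_2}, \dots, \frac{\partial g}{\partial \alpha_K}\right)'.$$
A linearization of $\hat{\theta}$ then yields
$$\hat{\theta} \cong \theta + G'\left(\hat{\alpha} - \alpha\right).$$
The sampling variance of $\hat{\theta}$ can then be shown to be asymptotically equal to
$$\operatorname{var}(\hat{\theta}) \cong G'VG,$$
where V is the asymptotic covariance matrix of the $\hat{\alpha}_k$ and is given by
$$V = \begin{pmatrix} \operatorname{var}(\hat{\alpha}_1) & \operatorname{cov}(\hat{\alpha}_1, \hat{\alpha}_2) & \cdots & \operatorname{cov}(\hat{\alpha}_1, \hat{\alpha}_K) \\ \operatorname{cov}(\hat{\alpha}_2, \hat{\alpha}_1) & \operatorname{var}(\hat{\alpha}_2) & \cdots & \operatorname{cov}(\hat{\alpha}_2, \hat{\alpha}_K) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{cov}(\hat{\alpha}_K, \hat{\alpha}_1) & \operatorname{cov}(\hat{\alpha}_K, \hat{\alpha}_2) & \cdots & \operatorname{var}(\hat{\alpha}_K) \end{pmatrix}.$$
The gradient elements $\partial g/\partial \alpha_1, \dots, \partial g/\partial \alpha_K$ can be estimated consistently by evaluating these derivatives at the estimates $\hat{\alpha}_1, \dots, \hat{\alpha}_K$. The elements of the covariance matrix can also be estimated consistently using the sample data, replacing population moments by their sample counterparts. Note that it is at the level of the estimation of these covariance elements that the full sampling design structure is taken into account (see Section 16.6).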
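To make the linearization concrete, here is a short Python sketch for the simplest non-trivial case, a ratio of two means θ = α1/α2 (K = 2). It is an illustration only: the function name is hypothetical, and the covariance matrix of the two sample means is estimated as if the data came from a simple random sample, whereas a design-based estimate would be needed with a complex survey design.

```python
from math import sqrt


def ratio_with_standard_error(y1, y2):
    """Estimate theta = alpha1 / alpha2 and its linearized standard error.

    alpha1 and alpha2 are the means of y1 and y2. The gradient of
    g(alpha1, alpha2) = alpha1 / alpha2 is (1/alpha2, -alpha1/alpha2**2).
    Assumes simple random sampling when estimating the covariance matrix.
    """
    n = len(y1)
    if n != len(y2) or n < 2:
        raise ValueError("need two samples of equal length, n >= 2")
    a1 = sum(y1) / n
    a2 = sum(y2) / n
    theta = a1 / a2

    # Estimated gradient, evaluated at the sample means.
    g1 = 1.0 / a2
    g2 = -a1 / a2 ** 2

    # Estimated covariance matrix of the two sample means.
    v11 = sum((u - a1) ** 2 for u in y1) / (n - 1) / n
    v22 = sum((v - a2) ** 2 for v in y2) / (n - 1) / n
    v12 = sum((u - a1) * (v - a2) for u, v in zip(y1, y2)) / (n - 1) / n

    # Quadratic form G'VG written out for K = 2.
    variance = g1 * g1 * v11 + 2.0 * g1 * g2 * v12 + g2 * g2 * v22
    return theta, sqrt(max(variance, 0.0))  # guard against rounding below 0


# Hypothetical example: share of poor persons, with y2 a constant count of 1.
theta_a, se_a = ratio_with_standard_error([1, 0, 0, 1], [1, 1, 1, 1])
# Hypothetical example: y1 exactly proportional to y2, so no sampling error.
theta_b, se_b = ratio_with_standard_error([2, 4, 6, 8], [1, 2, 3, 4])
```

In the second example the ratio is the same in every possible sample, and the linearized standard error correctly collapses to (numerically) zero.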
The outcome of an hypothesis test is a statistical decision: the conclusion of the test will either be to reject a null hypothesis, H0, in favor of an alternative, H1, or to fail to reject it. Most hypothesis tests involving an unknown true population parameter μ fall into three special cases:
1 H0: μ = μ0 against H1: μ ≠ μ0;
2 H0: μ ≤ μ0 against H1: μ > μ0;
3 H0: μ ≥ μ0 against H1: μ < μ0.
The ultimate statistical decision may be correct or incorrect. Two types of error can occur:
1 The first one, a Type I error, occurs when we reject H0 when it is in fact true;
2 The second one, a Type II error, occurs when we fail to reject H0 when H0 is in fact false.
The power of the test of an hypothesis H0 versus H1 is the probability of rejecting H0 in favor of H1 when H1 is true.
Let α be the level of statistical significance in which we are interested; α is often referred to as the size of an hypothesis test. It is the probability of making a Type I error, namely, the probability that we may wrongly reject a null hypothesis. Typical values of α are 0.01, 0.025, 0.05 and 0.1. Let z(p) be the p-quantile of the standardized normal distribution. That is, if F is a standard normal distribution function, then F(z(p)) ≡ p. Let $\hat{\mu}_0$ be the sample estimate of μ, that is, $\hat{\mu}_0$ is the value of $\hat{\mu}$ computed from the sample at hand, and define $z_0$ as $z_0 = (\hat{\mu}_0 - \mu_0)/\hat{\sigma}_{\hat{\mu}}$, where $\hat{\sigma}_{\hat{\mu}}$ is the estimated standard deviation of $\hat{\mu}$. The rules of rejection and non-rejection of the usual types of hypothesis tests are then as follows:¹
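1 (Two-sided H1) Reject H0: μ = μ0 in favor of H1: μ ≠ μ0 if and only if:
$$z_0 < z(\alpha/2) \quad \text{or} \quad z_0 > z(1 - \alpha/2). \qquad (17.11)$$
Note that (17.11) is equivalent to:
$$\mu_0 > \hat{\mu}_0 - \hat{\sigma}_{\hat{\mu}}\,z(\alpha/2) \quad \text{or} \quad \mu_0 < \hat{\mu}_0 - \hat{\sigma}_{\hat{\mu}}\,z(1 - \alpha/2).$$
Note also that the size of such a test is α since, under the null hypothesis, we have that
$$P\left(z_0 < z(\alpha/2) \ \text{or} \ z_0 > z(1 - \alpha/2)\right) = \alpha.$$
2 (Lower-bounded H1) Reject H0: μ ≤ μ0 in favor of H1: μ > μ0 if and only if:
$$z_0 > z(1 - \alpha). \qquad (17.14)$$
Again, (17.14) is equivalent to
$$\mu_0 < \hat{\mu}_0 - \hat{\sigma}_{\hat{\mu}}\,z(1 - \alpha).$$
3 (Upper-bounded H1) Reject H0: μ ≥ μ0 in favor of H1: μ < μ0 if and only if:
$$z_0 < z(\alpha),$$
which is equivalent to
$$\mu_0 > \hat{\mu}_0 - \hat{\sigma}_{\hat{\mu}}\,z(\alpha).$$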
¹ DAD: Distribution|Confidence Interval.
Table 17.1 sums up the confidence intervals and p-values for each of the three usual types of hypothesis tests considered above.
The p-value of an hypothesis test is the smallest significance level for which H0 would be rejected in favor of some H1. Roughly speaking, a p-value thus indicates the maximum probability that an error is made when one rejects a null hypothesis in favor of the alternative hypothesis. It therefore gives us the "risk" involved in rejecting a null hypothesis. The larger the p-value, the more imprudent it is to reject H0 in favor of H1.
A p-value is typically compared to some subjective error probability threshold such as 1%, 5% or 10%. If the p-value exceeds the chosen threshold, we do not reject the null hypothesis; if the p-value lies beneath the threshold, we reject the null hypothesis in favor of the alternative hypothesis.
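The three asymptotic tests and their p-values can be sketched as follows in Python, using the standard normal distribution from the standard library. The function name, the labels used for the alternatives, and the numerical example are hypothetical; the sketch assumes that an estimate and its estimated standard error are already available (for instance, from DAD).

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def normal_test(estimate, std_error, mu0, alpha=0.05, alternative="two-sided"):
    """Return (z0, p_value, reject) for a test of H0 based on asymptotic normality.

    alternative: "two-sided" (H1: mu != mu0),
                 "greater"   (H1: mu > mu0, lower-bounded H1),
                 "less"      (H1: mu < mu0, upper-bounded H1).
    """
    z0 = (estimate - mu0) / std_error
    if alternative == "two-sided":
        p_value = 2.0 * (1.0 - STANDARD_NORMAL.cdf(abs(z0)))
    elif alternative == "greater":
        p_value = 1.0 - STANDARD_NORMAL.cdf(z0)
    elif alternative == "less":
        p_value = STANDARD_NORMAL.cdf(z0)
    else:
        raise ValueError("unknown alternative: " + alternative)
    # Rejecting when the p-value is below alpha is the same decision as
    # comparing z0 with the relevant normal quantile.
    return z0, p_value, p_value < alpha


# Hypothetical example: estimated headcount of 0.23 with standard error 0.01,
# tested against a hypothesized value of 0.20.
z0, p_two_sided, reject_two_sided = normal_test(0.23, 0.01, 0.20)
_, p_greater, reject_greater = normal_test(0.23, 0.01, 0.20, alternative="greater")
_, p_less, reject_less = normal_test(0.23, 0.01, 0.20, alternative="less")
```

With these hypothetical numbers, the estimate lies three standard errors above the hypothesized value, so H0 is rejected against the two-sided and the "greater" alternatives but not against the "less" alternative.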
A confidence interval (or, more generally, a confidence set) is a range of values that is constructed using the sample data and that has a specified probability (1 − α) of containing the true parameter of interest μ. The probability value 1 − α associated with a confidence interval is known as the confidence level. More formally, consider the "parameter space" of μ, that is, the range of all of the values that μ could possibly take. A confidence interval u(1 − α) is then an estimate of μ in the sense that there should be a high probability 1 − α that μ is in that interval.
More precisely, a confidence level (1 − α) is the probability that μ is in u(1 − α):
$$P\left(\mu \in u(1 - \alpha)\right) = 1 - \alpha.$$
Typical confidence levels are 0.9, 0.95 and 0.99. Note that u(1 − α) is a random variable since it depends on the particular sample drawn from the population. Roughly speaking, a 1 − α confidence level is then the proportion of the times that a confidence interval will include the unknown parameter when independent samples are taken repeatedly from the same population and a 1 − α confidence interval is calculated for each sample. As for hypothesis tests, confidence intervals can be two-sided, lower-bounded or upper-bounded.
The width of a confidence interval thus gives us some idea about how uncertain we are about the true unknown parameter. In fact, building confidence intervals provides more information than carrying out simple hypothesis tests of the types described above. This is because confidence intervals provide a range of plausible values for the unknown parameter. Looking at Table 17.1, it can also be seen that there is a nice symmetry between the results of hypothesis tests and the confidence intervals that correspond to those tests. Indeed, the confidence intervals of Table 17.1 include all of the hypothesized H0 values that cannot be rejected in favor of the corresponding two-sided, lower-bounded or upper-bounded H1 hypotheses. Said differently, choosing any μ0 value inside of these confidence intervals will not lead to the rejection of H0, but choosing any value of μ0 outside of these intervals will lead to the rejection of H0 in favor of H1.²
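The symmetry between tests and confidence intervals can be illustrated with a small Python sketch. Since Table 17.1 is not reproduced here, the interval forms below are the standard normal-theory ones and should be read as an assumption; the function name and the labels are hypothetical.

```python
from statistics import NormalDist


def confidence_interval(estimate, std_error, alpha=0.05, kind="two-sided"):
    """Asymptotic (1 - alpha) confidence interval for mu.

    kind: "two-sided", "lower-bounded" (matches H1: mu > mu0) or
          "upper-bounded" (matches H1: mu < mu0).
    """
    z = NormalDist().inv_cdf  # quantile function of the standard normal
    if kind == "two-sided":
        half_width = std_error * z(1.0 - alpha / 2.0)
        return estimate - half_width, estimate + half_width
    if kind == "lower-bounded":
        return estimate - std_error * z(1.0 - alpha), float("inf")
    if kind == "upper-bounded":
        return float("-inf"), estimate + std_error * z(1.0 - alpha)
    raise ValueError("unknown kind: " + kind)


# Hypothetical example: estimated headcount of 0.23 with standard error 0.01.
lower, upper = confidence_interval(0.23, 0.01)
one_sided_lower, _ = confidence_interval(0.23, 0.01, kind="lower-bounded")

# Any hypothesized value outside the interval would be rejected at level alpha.
mu0_is_rejected = not (lower <= 0.20 <= upper)
```

Here the hypothesized value 0.20 lies outside the two-sided 95% interval, which is the same conclusion as rejecting H0: μ = 0.20 at the 5% level.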
The technique of the bootstrap (BTS), inspired in large part by Efron (1979), is being applied with increasing frequency in the applied economics literature. BTS is a method for estimating the sampling distribution of an estimator which proceeds by repeatedly resampling one's initial data. For each simulated sample, one recalculates the value of this estimator and then uses the generated BTS distribution to carry out statistical inference. In finite samples, neither the asymptotic nor the BTS sampling distribution is necessarily superior to the other. In infinite samples, they are usually equivalent. When combined together, they usually outperform either approach used individually.
The following steps summarize a typical BTS procedure:
 Draw n observations with replacement from the initial sample by taking into account the precise way in which the original sample was drawn (replicating, for instance, as closely as possible the survey design);
 Repeat the previous step B − 1 independent times;
 Assess the sampling distribution of the estimator (for instance, its sampling variance) using the distribution of B simulated values.
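The steps above can be sketched in Python as follows. This is a minimal illustration that assumes a simple random sample, so that resampling observations with equal probability replicates the design; with stratification or clustering, the resampling would have to be done within strata and at the PSU level. The function names, the poverty line and the data are hypothetical.

```python
import random
from statistics import stdev


def bootstrap_estimates(sample, estimator, n_replications=399, seed=12345):
    """Return B estimates, each computed from one bootstrap sample.

    Each bootstrap sample has the same size as the original sample and is
    drawn from it with replacement. Assumes simple random sampling.
    """
    rng = random.Random(seed)  # fixed seed so that results are reproducible
    n = len(sample)
    return [estimator(rng.choices(sample, k=n)) for _ in range(n_replications)]


def headcount(incomes, poverty_line=100.0):
    """Proportion of incomes strictly below the poverty line."""
    return sum(1 for y in incomes if y < poverty_line) / len(incomes)


# Hypothetical incomes for ten sampled individuals.
incomes = [50, 80, 120, 150, 90, 200, 60, 300, 110, 95]
estimate = headcount(incomes)
draws = bootstrap_estimates(incomes, headcount)
bootstrap_std_error = stdev(draws)  # spread of the B simulated values
```

The standard deviation of the simulated values is one summary of the sampling distribution; the full list of draws can also be used to compute quantiles.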
Let the vector V be made of B estimates of μ, each one computed from one of B simulated (or bootstrap) samples. The vector V is the main tool for capturing the sampling distribution of the estimator $\hat{\mu}$. Thus, we have:
$$V = \left(\hat{\mu}_1, \hat{\mu}_2, \dots, \hat{\mu}_B\right)',$$
² DAD: Distribution|Confidence Interval.
where $\hat{\mu}_i$ is the estimate of μ computed from the ith bootstrap sample. For two-sided tests and confidence intervals with significance level α or confidence level 1 − α, the number of simulations should be chosen so that α(B + 1)/2 is an integer (to facilitate the computation of critical test values). Let $\mu^*(p)$ be the p-quantile of the vector V, that is, the value at or below which a proportion p of the elements of V lies. The rules of rejection and non-rejection are then:
1 Reject H0: μ = μ0 in favor of H1: μ ≠ μ0 if and only if:
$$\mu_0 < \mu^*(\alpha/2) \quad \text{or} \quad \mu_0 > \mu^*(1 - \alpha/2).$$
2 Reject H0: μ ≤ μ0 in favor of H1: μ > μ0 if and only if:
$$\mu_0 < \mu^*(\alpha).$$
3 Reject H0: μ ≥ μ0 in favor of H1: μ < μ0 if and only if:
$$\mu_0 > \mu^*(1 - \alpha).$$
Table 17.2 summarizes the confidence intervals and p-values for each of the three usual types of hypothesis tests, using non-pivotal bootstrap statistics. The interpretation and the use of these statistics are analogous to what we saw in Section 17.3.
Let $\bar{\mu}$ and $\hat{\sigma}_i$ be respectively the average of the $\hat{\mu}_i$ in V and the estimate of the asymptotic standard deviation of $\hat{\mu}$ computed from the ith bootstrap sample. Let $t_i$ be the following asymptotically pivotal statistic:
$$t_i = \frac{\hat{\mu}_i - \bar{\mu}}{\hat{\sigma}_i}.$$
$t_i$ is asymptotically pivotal since it follows asymptotically a standardized N(0, 1) normal distribution which is free of nuisance parameters, i.e., parameters that are unknown.
Let the vector T then be defined as:
$$T = \left(t_1, t_2, \dots, t_B\right)',$$
and let t*(p) be the p-quantile of the vector T. The rules of rejection and non-rejection of the usual null hypotheses are then as follows:
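1 Reject H0: μ = μ0 in favor of H1: μ ≠ μ0 if and only if:³
$$z_0 < t^*(\alpha/2) \quad \text{or} \quad z_0 > t^*(1 - \alpha/2).$$
2 Reject H0: μ ≤ μ0 in favor of H1: μ > μ0 if and only if:
$$z_0 > t^*(1 - \alpha).$$
3 Reject H0: μ ≥ μ0 in favor of H1: μ < μ0 if and only if:
$$z_0 < t^*(\alpha).$$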
Table 17.3 summarizes the confidence intervals and p-values associated with each of the three usual types of hypothesis tests, using pivotal bootstrap statistics. Again, these statistics can be interpreted and used basically as above in Section 17.3.
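A pivotal bootstrap can be sketched as follows in Python for the simple case of a mean. The sketch centers the bootstrap estimates on their average, as in the definition of the statistics t_i above, and uses the standard bootstrap-t form of the two-sided interval; since Table 17.3 is not reproduced here, that interval form is an assumption. The function names and the data are hypothetical, and simple random sampling is assumed.

```python
import random
from math import sqrt
from statistics import mean, stdev


def mean_and_std_error(sample):
    """Sample mean and its estimated standard error (simple random sampling)."""
    return mean(sample), stdev(sample) / sqrt(len(sample))


def order_quantile(values, p):
    """p-quantile taken as the order statistic of rank p * (B + 1).

    Works best when p * (B + 1) is an integer, e.g. B = 399 and p = 0.025.
    """
    ordered = sorted(values)
    rank = round(p * (len(ordered) + 1))
    rank = min(max(rank, 1), len(ordered))   # keep the rank within bounds
    return ordered[rank - 1]


def bootstrap_t_interval(sample, alpha=0.05, n_replications=399, seed=2006):
    """Two-sided (1 - alpha) interval based on studentized bootstrap draws."""
    rng = random.Random(seed)
    n = len(sample)
    estimate, std_error = mean_and_std_error(sample)
    draws = [mean_and_std_error(rng.choices(sample, k=n))
             for _ in range(n_replications)]
    centre = mean(m for m, _ in draws)       # average of bootstrap estimates
    # Degenerate resamples with zero standard error are skipped.
    t_stats = [(m - centre) / s for m, s in draws if s > 0]
    lower = estimate - std_error * order_quantile(t_stats, 1.0 - alpha / 2.0)
    upper = estimate - std_error * order_quantile(t_stats, alpha / 2.0)
    return lower, upper


# Hypothetical incomes for twelve sampled individuals.
incomes = [50, 80, 120, 150, 90, 200, 60, 300, 110, 95, 130, 75]
lower, upper = bootstrap_t_interval(incomes)
```

With B = 399 and α = 0.05, α(B + 1)/2 = 10 is an integer, so the critical values are simply the 10th and 390th ordered statistics.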
Much of the statistical inference literature for distributive analysis has focused on deriving the sampling distribution of inequality and poverty indices. See Cowell (1999) and Davies, Green, and Paarsch (1998) for overall reviews, as well as Aaberge (2001b) for cross-country evidence of the role of sampling variability, Barrett and Pendakur (1995) for generalized Gini indices, Beach, Chow, Formby, and Slotsve (1994) for decile means, Bishop, Chakraborti, and Thistle (1990) for Sen's welfare index, Bishop, Chakraborti, and Thistle (1991a) for Gini-based relative deprivation indices, Bishop, Chow, and Zheng (1995b) for decomposable poverty indices, Bishop, Formby, and Zheng (1997) for Sen's poverty index, Bishop, Formby, and Zheng (1998) for Gini-based progressivity indices, Chotikapanich and Griffiths (2001) for approximating S-Gini indices using grouped data, Davidson and Duclos (2000) for various classes of poverty indices with deterministic and estimated poverty lines, Duclos (1997a) for linear progressivity and vertical equity indices, Kakwani (1993) for additive poverty indices, Ogwang (2000) for the Gini index, Preston (1995) for poverty indices with estimated poverty lines, Rongve (1997) for poverty indices with known poverty lines, Rongve and Beach (1997) for the use of approximations to inequality indices, Thistle (1990) for two classes of inequality indices, Van de gaer, Funnell, and McCarthy (1999) and Zheng and Cushing (2001) for comparing inequality across statistically dependent incomes, Xu (1998) for the P(z; ρ = 2) poverty index, and Zheng (2001b) for poverty indices with estimated poverty lines.
³ DAD: Distribution|Confidence Interval.
The second major area of statistical inference research in distributive analysis has dealt with the sampling distribution of tools for stochastic dominance. This includes Anderson (1996) for integrals of distribution functions, Bahadur (1966) for quantiles, Beach and Davidson (1983) for the Lorenz curve, Bishop, Chakraborti, and Thistle (1989) for Generalized Lorenz curves, Bishop and Formby (1999) for a review, Dardanoni and Forcina (1999) for different inference approaches to ordering Lorenz curves, Davidson and Duclos (1997) for Lorenz and concentration curves, Davidson and Duclos (2000) for primal and dual dominance curves, Klavus (2001) for an application to health care financing in Finland, Maasoumi and Heshmati (2000) for an application to Swedish distributions, Xu (1997) for Generalized Lorenz curves, Xu and Osberg (1998) for "deprivation curves", Zheng, Formby, Smith, and Chow (2000) for mean-normalized dominance curves, and Bishop, Chow, and Formby (1994b) and Zheng (1999b) for marginal dominance analysis using Lorenz and quantile curves.
Issues, methods and applications dealing with the multiple hypothesis tests associated with inferring stochastic dominance orderings can be found inter alia in Barrett and Donald (2003) for simulations of the distribution of statistics needed for complete sets of hypothesis tests, Beach and Richmond (1985) for the joint sampling distribution of some of these statistics, Bishop, Formby, and Thistle (1992) and Bishop, Chakraborti, and Thistle (1994a) for applications of the union-intersection approach, Kaur, Prakasa Rao, and Singh (1994) for testing second-order dominance, Kodde and Palm (1986) for Wald criteria for the joint testing of equality and inequality hypotheses, and Wolak (1989) for testing multivariate inequality constraints.
For general references to the bootstrap, see Efron and Tibshirani (1993) and MacKinnon (2002). Specific applications of the bootstrap and other resampling simulation methods to distributive analysis can be found inter alia in Biewen (2000) (for inequality indices), Biewen (2002a) (for a demonstration of the consistency of bootstrapping inequality, poverty and mobility indices), Mills and Zandvakili (1997) (for inequality indices), Palmitesta, Provasi, and Spera (2000) (for the Gini family of inequality indices), Xu (2000) (for iterated bootstrapping of the S-Gini indices), and Karagiannis and Kovacević (2000) and Yitzhaki (1991) (for jackknife calculations of the variance of the Gini).
For the use of the "influence function" in protecting against the possible presence of contaminated data, see Cowell and Victoria-Feser (1996b) (for inequality indices), Cowell and Victoria-Feser (1996a) (for poverty indices), and Cowell and Victoria-Feser (2002) (for social welfare rankings).
Other statistically relevant works can be found (among others) in Elbers, Lanjouw, and Lanjouw (2003) and Hentschel, Lanjouw, Lanjouw, and Poggi (2000) for "poverty mapping" (the estimation of small-area statistics on poverty and inequality using various data sources); Breunig (2001) for a bias correction to the estimation of the coefficient of variation; and Lerman and Yitzhaki (1989) for the impact of using aggregated data in the estimation of inequality indices and in making social welfare rankings.
To generate estimation and statistical inference results using DAD, the analyst does not need to specify the functional forms of the distribution of the population of interest. Said differently, to estimate, for instance, poverty and equity indices, or to generate the standard errors of such indices, we do not need to tell DAD that the incomes we are studying are distributed according to a normal, a Pareto, or a beta distribution, for instance. In that sense, all of DAD's results are "distribution free".
In some circumstances, it may however be useful to do distributive analysis conditional on some distributional assumption. Examples of such analysis can be found in Chotikapanich and Griffiths (2002) (estimation of Lorenz curves), Cowell, Ferreira, and Litchfield (1998) (density estimation in Brazil), Cheong (2002) (estimation of US Lorenz curves), Horrace, Schmidt, and Witte (1995) (sampling variability of order statistics using parametric and nonparametric approaches), Ogwang and Rao (2000) (parametric models of Lorenz curves), Ryu and Slottje (1999) (parametric approximations of Lorenz curves), Sarabia, Castillo, and Slottje (1999) and Sarabia, Castillo, and Slottje (2001) (general methods for building parametric models of Lorenz curves), and Schluter and Trede (2002b) (parametric estimation of tails of Lorenz curves).
Using the file AGGR7 [p.319], compute the poverty headcount by using the variable EXPCAP and a poverty line of 373 FCFA.
Then, find the poverty line which you must use with the variable TTEXP to obtain the same estimate of poverty as that obtained in question 18.1.1.
Using for the variable EXPCAP the poverty line used in question 18.1.1, and for the variable TTEXP the poverty line found in question 18.1.2, decompose poverty across household size GSIZE using EXPCAP and TTEXP. Discuss.
Using again the same file AGGR7 [p.319], decompose poverty across the sex of the household head SEX by using EXPCAP and TTEXP and their associated poverty line used in questions 18.1.1 and 18.1.2. Discuss.
Using the file AGGR7 [p.319], compute total poverty in Cameroon without using the SIZE variable and by using it. Discuss.
Using the file AGGR7 [p.319], decompose total poverty in Cameroon according to the categories captured by GSIZE without using the SIZE variable and by using it. Discuss.
Using the file DECB8 [p.319], decompose total poverty in Cameroon according to the categories captured by REGION without using the SIZE variable and by using it. Discuss.
Using the file DECB8 [p.319], compute the average of EXPEQ for the whole of Cameroon and for each of the two regions of REGION.
Then, compute half of these averages for the whole of Cameroon and for each of its two regions in REGION. (These statistics are subsequently used as relative poverty thresholds in 18.3.3.)
Finally, compute the poverty headcount for the whole of Cameroon and for each of its two regions using as poverty lines:
a a national absolute threshold of 373 FCFA;
b the national relative threshold;
c the relative thresholds for each of the two regions.
Check whether using an estimate of the national relative threshold (as opposed to a known or deterministic national relative threshold) has an impact on the standard error of the national headcount.
Computing a food poverty line with a "FEI-inspired" method. With LINE6 [p.319], draw a nonparametric regression of FDEQ on CALEQ for an interval of CALEQ of 0 to 4000 calories. Find the level of food expenditures that is expected to yield an intake of 2400 calories per day.
Computing a CBN poverty line. With LINE6 [p.319], draw a nonparametric regression of EXPEQ on FDEQ for an interval of FDEQ which includes the food poverty line estimated in 18.4.1. Find the level of total expenditures expected at a level of food expenditures equal to the food poverty line estimated in 18.4.1.
Using the results of 18.4.1 and 18.4.2, compute the share of food expenditures in the total expenditures of those whose level of food expenditures equals the food poverty line. By dividing the food poverty threshold by this share, estimate a global poverty line.
A second method of estimation of the share of food expenditures in total expenditures. With LINE6, draw a nonparametric regression of FDEQ on EXPEQ for an interval of EXPEQ of 0 to 500 FCFA. Find the level of food expenditures expected at a level of total expenditures equal to the food poverty line estimated in 18.4.1.
Using the results of 18.4.4, compute the share of food expenditures in the total expenditures of those whose level of total expenditures equals the food poverty line estimated in 18.4.1. By dividing the food poverty line by this share, estimate a second global poverty line.
A third method for the estimation of the nonfood poverty line. With LINE6, draw a nonparametric regression of EXPEQ on FDEQ for an interval of FDEQ which includes the food poverty line estimated in 18.4.1. Find the level of total expenditures expected at a level of food expenditures equal to the food poverty line estimated in 18.4.1.
Using the results of 18.4.6, compute the expected nonfood expenditures of those whose level of total expenditures equals the food poverty line estimated in 18.4.1. By adding these expected nonfood expenditures to the food poverty line, estimate a third global poverty line.
Computation of a global poverty line according to the FEI method. With LINE6 [p.319], draw a nonparametric regression of EXPEQ on CALEQ for an interval ranging from 0 to 4000 calories for CALEQ. Estimate the global poverty line that corresponds to 2400 calories per day.
Density functions. With DECB8 [p.319], estimate the density of LEXPEQ for the whole country and for each region (by using REGION).
With AGGR7 [p.319], decompose poverty across SEX. Then check the calculations of the absolute and relative decompositions provided by DAD by separately calculating the poverty indices for each group in SEX. Reconstruct manually the decomposition to verify that DAD gives the correct decomposition results.
Using DECA7 [p.319], decompose total poverty according to the socioeconomic categories AGE and EDUC.
Using DECB8 [p.319], decompose total poverty according to the socioeconomic categories SECT, TYPE and OCCUP.
Using DECA7 [p.319], plot the first-order dominance curves separately for those who have a primary level and a superior level of education (see variable EDUC) and for poverty lines varying between 0 and 1000 FCFA. What do these curves show?
Using DECA7 [p.319], plot the first-order dominance curves separately for the female-headed and for the male-headed households, for poverty lines varying between 0 and 300 FCFA. What do these curves indicate? Find the relevant "critical thresholds" and comment.
Repeat 18.7.1 and 18.7.2 for second- and third-order dominance.
Compute the FGT poverty index for α = 0 and for poverty lines equal to 150, 250 and 300 FCFA, separately for the female- and male-headed households.
Repeat 18.7.4 for second- and third-order dominance.
Using DECB8 [p.319], draw poverty gap curves separately for the two groups identified by the variable REGION.
Using DECA7 [p.319], draw poverty gap curves separately for the female-headed and the male-headed households.
Using DECB8 [p.319], draw CPG curves separately for the two groups identified by the variable REGION.
Using DECA7 [p.319], draw CPG curves separately for the female-headed and the male-headed households.
Use the file "CAN4" [p.321] to predict the level of taxes paid and benefits received by individuals at different gross incomes X. For this, use the window "nonparametric regression", and choose alternatively for the x axis the "level" or the "percentile" of gross incomes. What do these regressions indicate?
Use the file "CAN6" [p.321] to draw the Lorenz curve for gross income (X) and net income (N) in 1990 Canada.
a What does the difference between the two Lorenz curves indicate?
b Then, draw a concentration curve for each of the three transfers B1, B2 and B3 and the tax T. What can you say about the TR progressivity and the "equity" of the distribution of the tax and benefits?
c Would a proportional increase in the benefit B1 combined with a proportional decrease in B3 of the same absolute magnitude be good for inequality, poverty and social welfare?
d Would a proportional increase in the benefit B2 financed by a balanced-budget proportional increase in the tax T be good for inequality, poverty and social welfare?
Use the same file "CAN6" [p.321] to check the IR progressivity of each of the three benefits and the tax T. For this, you can draw concentration curves for X combined separately with each of the three transfers B1, B2 and B3 and the tax T. What can you say about the IR progressivity and the "equity" of the distribution of the tax and benefits? How does it compare with the TR-progressivity results?
Using the file "CAN4"[p.321], compute the concentration indices of each of B and T, and compare them to the Gini index of gross income X. Then, compute an estimate of TR progressivity of the tax and benefit system in Canada.
Compare the Lorenz curve for N with the concentration curve for N (using X as the ranking variable). What does this tell you?
Express the total redistribution exerted by the Canadian tax and transfer system as vertical equity minus horizontal inequity (reranking), using Gini and concentration indices.
Draw the conditional standard deviation of benefits B and taxes T at various values of gross income X.
Draw the conditional standard deviation of net income N at various values of gross income X. What does this indicate?
Draw the share of total taxes T paid by those at different levels of gross income X, and at different ranks of X. Compute this as the ratio of expected taxes over mean gross income µX. Do the same for total benefits B. Compare this to the share of total gross income, computed as X over µX. What does this say?
Compute the average tax rate paid by individuals at different levels of gross income X and at different ranks of X. Estimate this as the expected tax paid at X over X. What does this say about tax progressivity in Canada?
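For the exercises above that call for Gini and concentration indices, the following is a minimal Python sketch, not DAD's implementation, of how such indices can be computed and combined. It uses a trapezoidal approximation to the area under the Lorenz or concentration curve and then writes total redistribution as vertical equity minus reranking, G_X - G_N = (G_X - C_N) - (G_N - C_N), where C_N is the concentration index of N when observations are ranked by X. The function name concentration_index and the four-observation data are hypothetical and are not taken from CAN4 or CAN6.

```python
import numpy as np

def concentration_index(y, rank_by, w=None):
    """Concentration index of y when observations are ranked by rank_by.

    With rank_by equal to y, this is the Gini index of y.
    Ties in rank_by are broken by input order (an arbitrary choice).
    """
    y = np.asarray(y, dtype=float)
    rank_by = np.asarray(rank_by, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    order = np.argsort(rank_by, kind="stable")
    y, w = y[order], w[order]
    # Cumulative population shares and cumulative shares of y, both starting at 0.
    p = np.concatenate(([0.0], np.cumsum(w) / w.sum()))
    c = np.concatenate(([0.0], np.cumsum(w * y) / np.sum(w * y)))
    # Trapezoidal area under the curve; the index is one minus twice that area.
    area = np.sum((p[1:] - p[:-1]) * (c[1:] + c[:-1]) / 2.0)
    return 1.0 - 2.0 * area

# Hypothetical gross incomes X and net incomes N (made-up numbers).
X = np.array([10.0, 20.0, 30.0, 40.0])
N = np.array([18.0, 16.0, 25.0, 30.0])  # the two poorest swap ranks

G_X = concentration_index(X, X)   # Gini index of gross income
G_N = concentration_index(N, N)   # Gini index of net income
C_N = concentration_index(N, X)   # concentration index of N, ranked by X

vertical = G_X - C_N              # vertical equity
reranking = G_N - C_N             # horizontal inequity (reranking)
redistribution = G_X - G_N        # equals vertical - reranking

print(f"G_X={G_X:.4f}  G_N={G_N:.4f}  C_N={C_N:.4f}")
print(f"vertical={vertical:.4f}  reranking={reranking:.4f}  "
      f"redistribution={redistribution:.4f}")
```

Because the Lorenz curve of N can never lie above its concentration curve, the reranking term is non-negative; it is strictly positive here because the tax and transfer system in the made-up data changes the ranking of the two poorest observations.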
Use the file PERHE12 [p.322] for exercises 18.8.11 to 18.8.17.
Compare the Lorenz curve of per capita total expenditures (EXPCAP), using SIZE, and of total expenditures (TTEXP). Which type of expenditures is more equally distributed? Why?
a To understand better why, add to the graph a concentration curve of total expenditures, using per capita expenditures as the ranking variable, and WHHLD to count observations; this will indicate the concentration of total expenditures among the poorest households, ranked by per capita expenditures.
b To complete your understanding, add a concentration curve for household size, using WHHLD as the aggregating weight and EXPCAP as the ranking variable; this will indicate the concentration of individuals among the poorest households, as ranked by EXPCAP. Does this help you understand the difference between the above two Lorenz curves?
Predict the proportion of individuals who visited a public health center and a public hospital in a given month. For this, use the variables CENTRO and HOSPIT, which indicate the proportion of individuals in a household who visited these institutions. Make this prediction at different percentiles of the distribution of per capita total expenditures.
Graph again the concentration curve of total expenditures (TTEXP) using WHHLD and EXPCAP to rank individuals. Compare it to the concentration curve among households (thus use WHHLD) of their use of health centers and public hospitals, which is given respectively by NCENTRO and NHOSPIT, and use EXPCAP to rank households. What does this suggest?
Add to the previous graph the concentration curve of individuals in households. What does this information add to your equity judgement?
Draw on a new graph the concentration curve of total expenditures (TTEXP) using WHHLD and EXPCAP to rank households. Compare this to the concentration curves for access to piped water (PUBWAT) and to sewerage (PUBSEW), using WHHLD as the aggregating weight to draw the curves and EXPCAP as the ranking variable. That is, find out the concentration of access to piped water and sewerage among various proportions of poorest households, and compare that to their share in total expenditures. What do you find?
Add to your previous graph the concentration curves for the number of individuals who have piped water (NPUBWAT) and who have sewerage (NPUBSEW), using household weighting and EXPCAP as the ranking variable. How do you interpret the differences you obtain with the results of question 18.8.15?
Redo the previous analysis of the incidence of access to piped water (PUBWAT) and to sewerage (PUBSEW), but this time use individual weighting (which is usually considered to be the best descriptive choice from a normative or ethical perspective). Thus, draw the Lorenz curve of per capita total expenditures (EXPCAP) using individual weighting, WIND. Compare this to the concentration among individuals of the access to piped water (PUBWAT) and to sewerage (PUBSEW), using WIND as the aggregating weight to draw the curves and EXPCAP as the ranking variable. That is, find out the concentration of access to piped water and sewerage among various proportions of poorest individuals, and compare that to their share of the population and of the total expenditures.
Use the file PERED16 [p.322] for exercises 18.8.18 to 18.8.22.
Make a new graph again of the concentration curve of total expenditures (TTEXP) using WHHLD and EXPCAP to rank households. Compare it to the concentration curve of the number of children at various levels of public education, NPUBPRIM, NPUBSEC and NPUBUNIV using the same aggregating weights. Is education enrolment equitably distributed according to this? What happens to our understanding of the "picture" if we add the concentration curve for the number of children NCHILD?
Now add the Lorenz curve of per capita total expenditures EXPCAP using NCHILD as the size variable. Compare it to the concentration curve of the enrolment of children at various levels of public education, which is given by PUBPRIM, PUBSEC and PUBUNIV, using NCHILD and EXPCAP as the ranking variable. Has your equity judgement evolved?
Redraw the Lorenz curve of per capita total expenditures, now using WCH0612 as the aggregating weight, and compare it to the concentration curve of PUBPRIM using the same aggregating weight.
Redraw the Lorenz curve of per capita total expenditures, now using WCH1318 as the aggregating weight, and compare it to the concentration curve of PUBSEC using the same aggregating weight.
Test the hypothesis that a small increase in secondary school fees combined with a decrease in primary school fees of the same total magnitude would not change the distribution of wellbeing in Peru.
Use the file SENESAM [p.323] to do exercises 18.8.23 to 18.8.26.
Using EXPEQ as ranking variable, predict the proportion of children between 7 and 12 at different levels of living standards who attend primary school. Compare these results to those you obtain when you separate children into boys and girls.
Draw the conditional standard deviation of primary school attendance separately for boys and girls at various values of EXPEQ. What does this indicate?
Draw the conditional standard deviation of primary school attendance separately for each of the three STRATA, at various values of EXPEQ. Explain what you find.
Compare the Lorenz curve for EXPEQ with the concentration curve for EXPEQ (using TTEXP as the ranking variable). What does this suggest?
Use the file ESPMEN [p.325] to do exercises 18.8.27 to 18.8.53. When needed, use EXPEQ as the variable of interest, the headcount as the poverty index, and a poverty line of 60000 FCFA per adult equivalent.
Draw the concentration curves of FDEQ, NFDEQ, HEALTHEQ and SCHEXPEQ for the population of individuals (i.e., setting the size variable to SIZE) and using EXPEQ as the ranking variable. How do these curves compare to the Lorenz curve for EXPEQ?
Compare the Lorenz curves of EXPEQ and INCOMEQ. What do you find? How do you explain this?
Compare the Lorenz curves of EXPEQ for each of the 3 values of DEPT.
Draw the CD curve (normalized by the mean of the variables but not by the poverty lines) of FDEQ and NFDEQ for different poverty lines and for c=1. What does it tell you?
Compute the Gini inequality index for TTEXP, EXPEQ and INCOMEQ. Do this for values of ρ equal to 1, 2 and 3. Then, draw these indices for each of these variables on a graph for ρ ranging from 1 to 5.
Decompose inequality in EXPEQ as a sum of inequality in each of its four components, FDEQ, NFDEQ, HEALTHEQ and SCHEXPEQ.
Draw the share of total SCHEXPEQ of those at different levels of EXPEQ, and at different ranks of EXPEQ. Compute this as the ratio of expected SCHEXPEQ conditional on some value of EXPEQ over that value of EXPEQ. Do the same for HEALTHEQ. What do you find?
What would the impact on poverty be if we were to transfer 1000 FCFA (per adult equivalent) to each individual in the population?
Where, among the different DEPT, would the impact of group targeting an equal amount to all be the greatest for the same overall budget spent by the government? Does this result depend on the choice of the poverty line?
Where, among the different REGION, would the impact of group targeting an equal amount to all be the greatest for the same overall budget spent by the government?
Assume that some form of government targeting can raise everyone's EXPEQ by the same proportion in a particular area. Per FCFA of overall per capita increase in EXPEQ, for which targeted DEPT would aggregate poverty reduction be the largest? Check this for the headcount and for the average poverty gap indices.
Assume that some form of government targeting can raise everyone's EXPEQ by the same proportion in a particular area. Per FCFA of overall per capita increase in EXPEQ, for which targeted ZONE would aggregate poverty reduction be the largest?
Say that food prices are about to increase by about 5%, due to the removal of food subsidies. In which group within DEPT will poverty increase the most?
Using CD curves, check whether the ZONE for which the impact of an increase in food prices will be the largest depends on the choice of the poverty line and on the choice of poverty index (focus on first-order poverty indices).
The government wishes to determine whether increasing the price of HEALTHEQ, for the benefit of a fall in the price of SCHEXPEQ, would be good for poverty.
a Compare the distributive cost/benefit of changing the price of each of HEALTHEQ and SCHEXPEQ.
b Check whether the reform is good for poverty for ratios of MCPF ranging from 0.5 to 2.0.
Find the impact on poverty of those within ZONE=1 of a predicted increase of 3% in expenditures EXPEQ.
Find the impact on national poverty of a predicted increase of 3% in the expenditures EXPEQ of those within ZONE=1. Compare your results to those obtained for ZONE=2. Do this for FGT indices with α=0, 1 and 2.
Find the impact on national poverty of a predicted increase of 3% in everyone's expenditures FDEQ. Compare your results to those for a 3% increase in everyone's NFDEQ. Do this for the FGT indices with α=0, 1 and 2.
Per FCFA of growth in overall per capita EXPEQ, in which of ZONE=1 or ZONE=2 is growth in expenditures EXPEQ conducive to greater poverty reduction?
Per FCFA of growth in overall per capita EXPEQ, which of growth in FDEQ or in NFDEQ leads to greater poverty reduction? Graph this for a range of poverty lines and for all poverty indices of the second order (α=1, or s=2).
What is the elasticity of poverty with respect to EXPEQ? Compute this for the different DEPT.
The government wishes to determine whether increasing the price of HEALTHEQ by 5%, for the benefit of a revenue-neutral fall in the price of SCHEXPEQ, would be good for inequality reduction. Assume a ratio of MCPF=1. Find out the impact on the Lorenz curve and on the Gini coefficient.
Say that food prices are about to increase by about 10%, due to the removal of food subsidies. What is the predicted impact on the Gini index and on L(p = 0.5)?
Find the impact on the Gini index of inequality of a predicted increase of 3% in the expenditures EXPEQ.
Find the impact on the Gini index and on L(p = 0.5) of a predicted increase of 3% in everyone's expenditures FDEQ. Compare your results to those for a similar 3% increase in NFDEQ.
The government wishes to determine whether increasing the price of NFDEQ, for the benefit of a fall in the price of FDEQ, would be good for poverty.
i Assess this for a ratio of the MCPF of NFDEQ over that of FDEQ equal to 1, for a range of poverty lines and for all distribution-sensitive poverty indices (second-order, α=1 or s=2).
ii Up to which ratio of MCPF can we go and still declare the reform to be good for poverty?
iii Are these conclusions also valid for the goal of inequality reduction?
Use the file ESPSANT [p.326] to do exercises 18.8.53 to 18.8.54. When needed, use the headcount as the poverty index and set the poverty line to 60000 FCFA per adult equivalent.
Using EXPEQ as the ranking variable, predict the proportion of individuals at different levels of living standards whose households make use of public health services. Do this separately for the different values of SEX.
Compute the proportion of EXPEQ that is spent on HEALTHEQ by individuals at different levels and ranks of EXPEQ. What does this suggest?
Use the file ESPSCOL [p.326] to do exercises 18.8.55 to 18.8.57. When needed, use the headcount as the poverty index and set the poverty line to 60000 FCFA per adult equivalent.
Predict the proportion of children below 14 at different values of EXPEQ that attend primary school. Compare the results you obtain across the different values of ZONE. How do these results compare with those for attending secondary school?
Compare the concentration curve (among children below 14) of attendance at primary school, secondary school, public primary school and public secondary school, using EXPEQ as the ranking variable. Draw this for various proportions of the poorest children. Compare this concentration curve with the Lorenz curve for EXPEQ. Discuss your results.
Draw the concentration curves of UNI and SUP. Compare this to the Lorenz curve for EXPEQ for the same population.
Load the file Burkina_94 [p.327] and initialize its sampling design (SD). Do this first only by specifying the variable WEIGHT as sampling weight.
a Compute the mean of total expenditure per adult equivalent (EXPEQ) with the size variable equal to SIZE. Why does STD1 differ from STD2? What is a sufficient condition for the two standard deviations to be equal?
b Now use both variables WEIGHT and STRATA to reinitialize the SD of this file. Compute, again, the mean of total expenditure per adult equivalent when the size variable is SIZE, and compare with the STD's of question a. What can be said about the impact of stratification on STD1?
c Now use variables WEIGHT, STRATA and PSU to reinitialize the SD of this file. Compute, again, the mean of total expenditure per adult equivalent when the size variable is SIZE, and compare with the STD's of questions a and b. What can you say about the impact of PSU's on STD1?
d By using the GSE variable to specify the socio-professional group, compute the mean of total expenditure per adult equivalent when the size variable is SIZE and for groups 1, 2, and 6. How does the sampling variability differ across these estimates?
Load the file Senegal_95 [p.323] and compute the mean of total expenditure per adult equivalent (EXPEQ) after initializing the sampling design with variables STRATA, PSU and WEIGHT.
Wellbeing, in a household, can be represented alternatively by:
a Total expenditure of household "EXP"
b Total expenditure per capita "EXPCAP"
c Total expenditure per adult equivalent "EXPEQ"
Using the variable SIZE to set the size variable, compute the headcount and the average poverty gap indices when the poverty line equals 140000 FCFA and when the variable of interest is alternatively EXPEQ, EXPCAP and EXP. Explain why the results differ.
When sample observations represent households, three size variables are typically used in combination with the variable of interest EXPEQ:
a 1 for all households
b The number of persons in the household (SIZE)
c The number of adult equivalents in the household (EQUI)
Compute the FGT index for α = 0, 1 for every one of these three alternative definitions of the size variable and explain the difference.
Compute the Gini and Atkinson (with ε = 0.5) indices of inequality for:
a Total expenditure when the size variable equals 1 for all households.
b Expenditure per capita when the size variable equals SIZE.
c Expenditure per equivalent adult when the size variable equals SIZE.
d Expenditure per equivalent adult when the size variable equals 1. Comment on the differences between these results.
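To see why the choice of size variable matters in the last few exercises, the following minimal Python sketch (not DAD's own code) computes weighted FGT, Gini and Atkinson indices in which each household observation is weighted by its sampling weight times a size variable. The function names, the strict inequality used to define the poor, and the four-household data are assumptions made for illustration; only the variable names EXPEQ, SIZE, EQUI and WEIGHT and the poverty line of 140000 FCFA come from the exercises.

```python
import numpy as np

def fgt(y, z, alpha, w):
    """Normalized FGT index: weighted mean of ((z - y) / z)**alpha over the poor."""
    y, w = np.asarray(y, dtype=float), np.asarray(w, dtype=float)
    poor = y < z                                  # assumption: strict inequality
    gap = np.where(poor, (z - y) / z, 0.0)
    contrib = np.where(poor, gap ** alpha, 0.0)   # mask needed because 0**0 == 1
    return np.sum(w * contrib) / np.sum(w)

def gini(y, w):
    """Weighted Gini index from a trapezoidal approximation to the Lorenz curve."""
    y, w = np.asarray(y, dtype=float), np.asarray(w, dtype=float)
    order = np.argsort(y, kind="stable")
    y, w = y[order], w[order]
    p = np.concatenate(([0.0], np.cumsum(w) / w.sum()))
    lorenz = np.concatenate(([0.0], np.cumsum(w * y) / np.sum(w * y)))
    return 1.0 - np.sum((p[1:] - p[:-1]) * (lorenz[1:] + lorenz[:-1]))

def atkinson(y, w, eps):
    """Weighted Atkinson index: 1 - (equally distributed equivalent income / mean)."""
    y, w = np.asarray(y, dtype=float), np.asarray(w, dtype=float)
    mu = np.sum(w * y) / np.sum(w)
    if eps == 1.0:
        ede = np.exp(np.sum(w * np.log(y)) / np.sum(w))
    else:
        ede = (np.sum(w * y ** (1.0 - eps)) / np.sum(w)) ** (1.0 / (1.0 - eps))
    return 1.0 - ede / mu

# Hypothetical household data (made-up numbers).
expeq = np.array([100000.0, 150000.0, 200000.0, 90000.0])  # per adult equivalent
size = np.array([8.0, 5.0, 3.0, 10.0])                     # persons
equi = np.array([5.5, 3.8, 2.6, 6.4])                      # adult equivalents
weight = np.array([100.0, 120.0, 80.0, 90.0])              # sampling weights
z = 140000.0

h_hh = fgt(expeq, z, 0, weight)           # share of poor households
h_ind = fgt(expeq, z, 0, weight * size)   # share of poor individuals
h_eq = fgt(expeq, z, 0, weight * equi)    # share of poor adult equivalents
print(h_hh, h_ind, h_eq)
print(gini(expeq, weight * size), atkinson(expeq, weight * size, 0.5))
```

In this made-up sample the poor households are also the largest ones, so the headcount among individuals exceeds the headcount among households; the same mechanism drives the differences between the inequality indices computed with different size variables.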
The files LINE6, AGGR7, DECA7, and DECB8 are made of a subsample of 1000 observations drawn from a survey (the Enquête Camerounaise auprès des ménages or ECAM) on the expenditures and the incomes of households in Cameroon in 1996. The file CAMEROON96 is made of approximately 1700 households from the same survey. The ECAM is a nationally representative survey, with sample selection using two-stage stratified random sampling. The first stage consists of the selection of 150 PSUs ("îlots") across the six strata, and the second stage consists of the selection of households within each PSU.

              Yaoundé   Douala   Cities(α)   Rural (3 strata)   Total
Households    336       384      360         630                1710
PSU (îlot)    42        48       30          30                 150
(α) Cities <50000 inhabitants.
In Yaoundé and Douala, PSU's are systematically selected with equal probabilities. The number of PSU's drawn by stratum is proportional to the number of urban households found in 1987 in that stratum. In a second stage, 8 households are drawn in every PSU (with equal probabilities), using a list of households established during an enumeration of that PSU.
For the other cities, at the first stage, one city is selected for every one of the ten provinces. Enumeration zones are then drawn with probability proportional to the number of households originally listed in 1987. Households are then drawn as above.
In every one of the three rural strata, two PSU's were selected within the semi-urban area and 8 in the rural area. PSU's were again drawn with probability proportional to the number of households enumerated in 1987. Within each selected PSU, 21 households were systematically selected from a household list.
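To illustrate how stratification and clustering of the kind described above affect the sampling variability of an estimated mean, the following minimal Python sketch computes a design-based standard error by first-order linearization, treating PSUs as if they were drawn with replacement within strata. This is a generic textbook approximation and is not claimed to reproduce DAD's STD1 or STD2; the function name design_mean_se and the eight-household data are hypothetical.

```python
import numpy as np

def design_mean_se(y, weight, size, strata, psu):
    """Weighted mean of y and its linearized standard error.

    Observations are weighted by weight * size. The variance is built from
    PSU totals of the linearized variable, stratum by stratum, under the
    assumption that PSUs are sampled with replacement within each stratum.
    """
    y, weight, size = (np.asarray(a, dtype=float) for a in (y, weight, size))
    strata, psu = np.asarray(strata), np.asarray(psu)
    ws = weight * size
    total = ws.sum()
    mean = np.sum(ws * y) / total
    e = ws * (y - mean) / total          # linearized contribution of each household
    var = 0.0
    for h in np.unique(strata):
        in_h = strata == h
        psus = np.unique(psu[in_h])
        n_h = len(psus)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        t = np.array([e[in_h & (psu == c)].sum() for c in psus])  # PSU totals
        var += n_h / (n_h - 1) * np.sum((t - t.mean()) ** 2)
    return mean, np.sqrt(var)

# Hypothetical sample: 2 strata, 2 PSUs per stratum, 2 households per PSU.
y = np.array([10.0, 12.0, 30.0, 32.0, 50.0, 52.0, 70.0, 72.0])
ones = np.ones(8)
strata = np.array([1, 1, 1, 1, 2, 2, 2, 2])
psu = np.array([1, 1, 2, 2, 1, 1, 2, 2])           # PSU labels within strata
psu_unique = np.array([1, 1, 2, 2, 3, 3, 4, 4])    # PSU labels across the sample
no_strata = np.zeros(8, dtype=int)
own_psu = np.arange(8)                             # every household its own PSU

mean, se_full = design_mean_se(y, ones, ones, strata, psu)
_, se_cluster = design_mean_se(y, ones, ones, no_strata, psu_unique)
_, se_srs = design_mean_se(y, ones, ones, no_strata, own_psu)
print(mean, se_srs, se_cluster, se_full)
```

In this made-up sample, households within a PSU are very similar, so accounting for clustering alone raises the standard error relative to treating households as independent draws, while accounting for stratification lowers it again because most of the variation lies between strata.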
Variables for CAMEROON96
STRATA: Stratum of the household
    1 Yaoundé
    2 Douala
    3 Cities
    4 Rural: Forest
    5 Rural: Hauts-Plateaux
    6 Rural: Savanna
PSU: PSU of the household
WEIGHT: Sampling weight
SIZE: Household size
NADULT: Number of adults
NCHILD: Number of children
EXPEQ: Total expenditures per adult equivalent per day
F_EXP: Food expenditures per adult equivalent per day
NF_EXP: Non-food expenditures per adult equivalent per day
INS_LEV: Education level of the head of the household
    1 Primary
    2 Professional Training
    3 Secondary 1st cycle
    4 Secondary 2nd cycle
    5 Superior
    6 Not responding
Variables for LINE6, AGGR7, DECA7, and DECB8
CAN4 and CAN6 contain illustrative data made of a small subsample of observations drawn from the Canadian surveys of Consumer Finance. They contain the following variables:
Variables for CAN4 and CAN6
PERHE12 and PERED16 contain an illustrative sample of some 3600 household observations drawn from the 1994 Peru LSMS survey.
Variables for PERHE12
TTEXP: total expenditures of household (constant June 1994 soles per year).
EXPCAP: total expenditures, per capita (constant June 1994 soles per year).
WHHLD: household aggregation weight.
SIZE: household size.
CENTRO: proportion of individuals in the household who used a public health center in last month.
HOSPIT: proportion of individuals in the household who used a public hospital in last month.
PUBWAT: household has piped water.
PUBSEW: household has sewerage.
NCENTRO: number of individuals in the household who used a public health center in last month.
NHOSPIT: number of individuals in the household who used a public hospital in last month.
NPUBWAT: number of individuals in household who have access to piped water.
NPUBSEW: number of individuals in household who have access to sewerage.
Variables for PERED16
EXPCAP: total expenditures, per capita (constant June 1994 soles per year).
WHHLD: household aggregation weight.
WIND: individual aggregation weight.
WCHILD: child aggregation weight.
WCH0612: aggregation weight for children between 6 and 12.
WCH1318: aggregation weight for children between 13 and 18.
NCHILD: number of children in household (18 and below).
NCHILD0612: number of children between 6 and 12.
NCHILD1318: number of children between 13 and 18.
NPUBPRIM: number of household members in public primary school.
NPUBSEC: number of household members in public secondary school.
NPUBUNIV: number of household members in public postsecondary school.
PUBPRIM: number of household members in public primary school as a proportion of NCHILD.
PUBSEC: number of household members in public secondary school as a proportion of NCHILD.
SIZE: household size.
TTEXP: total expenditures of household (constant June 1994 soles per year).
SENEGAL_95 is drawn from a nationally representative survey carried out in Sénégal in 1995 (the Enquête sénégalaise auprès des ménages, or ESAM), with sample selection using a multi-stage stratified random sampling procedure. The country was first split into five strata. The first sampling stage consisted of the selection of PSU's (enumeration areas, or Secteurs d'énumération (SE)) from a 1990 "Master Sample" list with probability proportional to the number of households in the PSU's. 396 SE were thus selected in the urban area and 204 in the rural area. Census districts were then selected within each SE. In a final stage, 15 households were systematically selected within each of the urban census districts, and similarly 24 households were systematically selected within each of the rural census districts.
STRATA                 Households   SE in Master   Census Districts   # of households
                       in census    Sample         in ESAM            in ESAM
URBAN                  333343       396            132                1980
+ DAKAR                187799       218            74                 1110
  - High               69065        82             28                 420
  - Medium             52768        63             22                 330
  - Low                59946        73             24                 360
+ Other urban areas    145544       178            58                 870
RURAL                  450276       204            55                 1320
TOTAL                  783319       600            187                3300
Variables for SENEGAL_95
Variables for SENESAM
WEIGHT: aggregation weight.
STRATA: survey strata.
SIZE: household size.
WCH712: aggregation weight for children between 7 and 12.
WMCH712: aggregation weight for boys between 7 and 12.
WFCH712: aggregation weight for girls between 7 and 12.
TTEXP: total household expenditures.
EQUI: number of adult equivalents in household.
EXPEQ: total expenditures per adult equivalent.
SCHEXP: household school expenditures.
SCHEXPEQ: school expenditures per adult equivalent.
PSCH712: proportion of children between 7 and 12 in school.
PMSCH712: proportion of boys between 7 and 12 in school.
PFSCH712: proportion of girls between 7 and 12 in school.
18.11.5 ESPMEN, ESPSANT and ESPSCOL
These files are drawn from illustrative subsamples of Sénégal's ESP (Enquête Sénégalaise Prioritaire).
Variables for ESPMEN
Variables for ESPSANT
WEIGHT: aggregation weight.
SIZE: household size.
DEPT: geographical department.
ZONE: geographical zone.
TTEXP: total household expenditures (including health and education).
EQUI: number of adult equivalents in household.
EXPEQ: total expenditures per adult equivalent.
HEALTHEQ: health expenditures per adult equivalent.
HEALTHUSE: household uses public health services.
SEX: sex of household head.
Variables for ESPSCOL
WEIGHT: aggregation weight.
SIZE: household size.
DEPT: geographical department.
ZONE: geographical zone.
EQUI: number of adult equivalents in household.
EXPEQ: total expenditures per adult equivalent.
NCHILD: number of children.
PRIM: proportion of children below 14 going to primary school.
SEC: proportion of children below 14 going to secondary school.
UNI: proportion of household members attending university.
SUP: proportion of household members attending superior education.
SEX: sex of household head.
PUBPRIM: proportion of children below 14 going to public primary school.
PUBSEC: proportion of children below 14 going to public secondary school.
Burkina_94 is drawn from a nationally representative survey (Enquête Prioritaire) carried out in Burkina Faso in 1994, with sample selection using two-stage stratified random sampling. Seven strata were formed. Five of these strata were rural and two were urban. Enumeration areas (PSU's, or zones de dénombrement) were sampled in a first stage from a list computed from the 1985 census. This first-stage sampling within stratum 7 (Ouagadougou-Bobo-Dioulasso) was made with equal probability and without replacement. First-stage sampling within the other 6 strata was made with probability proportional to the size (estimated from the 1985 census) of each PSU and without replacement. 20 households were then systematically sampled within each of the selected PSU's in a second stage.
Variables for BURKINA_94
______(1994b): "Testing for Marginal Changes in Income Distributions with Lorenz and Concentration Curves," International Economic Review, 35, 47988.
______(1995a): "The Redistributive Effect of Direct Taxes: A Comparison of Six Luxembourg Income StudyCountries," Journal of Income Distribution, 5, 6590.
BISHOP, J., V. CHOW, AND B. ZHENG (1995b): "Statistical Inference and Decomposable Poverty Measures," Bulletin of Economic Research, 47, 32940.
BISHOP, J. AND J. FORMBY (1999): "Tests of Significance for Lorenz Partial Orders," in Handbook of income inequality measurement. With a foreword by Amartya Sen, ed. by J. Silber, Boston; Dordrecht and London: Kluwer Academic, Recent Economic Thought Series, 31536.
BISHOP, J., J. FORMBY, AND P. LAMBERT (2000): "Redistribution through the Income Tax: The Vertical and Horizontal Effects of Noncompliance and Tax Evasion," Public Finance Review, 28, 33550.
BISHOP, J., J. FORMBY, AND J. SMITH (1991c): "International Comparisons of Income Inequality: Tests for Lorenz Dominance across Nine Countries," Economica, 58, 46177.
______(1991d): "Lorenz Dominance and Welfare: Changes in the U.S. Distribution of Income, 19671986," Review of Economics and Statistics, 73, 13439.
______(1993): "International Comparisons of Welfare and Poverty: Dominance Orderings for Ten Countries," Canadian Journal of Economics, 26, 70726.
BISHOP, J., J. FORMBY, AND P. THISTLE (1991e): "Rank Dominance and International Comparisons of Income Distributions," European Economic Review, 35, 13991409.
______(1992): "Convergence of the South and NonSouth Income Distributions, 19691979," American Economic Review, 82, 26272.
BISHOP, J., J. FORMBY, AND L. ZEAGER (1996): "The Impact of Food Stamps on US Poverty in the 1980s: A Marginal Dominance Analysis," Economica, 63, S14162.
BISHOP, J., J. FORMBY, AND B. ZHENG (1997): "Statistical Inference and the Sen Index of Poverty," International Economic Review, 38, 38187.
______(1998): "Inference Tests for GiniBased Tax Progressivity Indexes," Journal of Business and Economic Statistics, 16, 32230.
BISOGNO, M. AND A. CHONG (2001): "Foreign Aid and Poverty in Bosnia and Herzegovina: Targeting Simulations and Policy Implications," European Economic Review, 45, 102030.
BJORKLUND, A. (1993): "A Comparison between Actual Distributions of Annual and Lifetime Income: Sweden 195189," Review of Income and Wealth, 39, 37786.
BLACKBURN, M. (1989): "Interpreting the Magnitude of Changes in Measures of Income Inequality," Journal of Econometrics, 42, 2125.
______(1998): "The Sensitivity of International Poverty Comparisons," Review of Income and Wealth, 44, 44972.
BLACKLOW, P. AND R. RAY (2000): "A Comparison of Income and Expenditure Inequality Estimates: The Australian Evidence, 197576 to 199394," Australian Economic Review, 33, 31729.
BLACKORBY, C., W. BOSSERT, AND D. DONALDSON (1994): "Generalized Ginis and Cooperative Bargaining Solutions," Econometrica, 62, 11611178.
______(1999): "Income Inequality Measurement: The Normative Approach," in Handbook of income inequality measurement. With a foreword by Amartya Sen, ed. by J. Silber, Boston; Dordrecht and London: Kluwer Academic, Recent Economic Thought Series, 13357.
BLACKORBY, C. AND D. DONALDSON (1978): "Measures of Relative Equality and Their Meaning in Terms of Social Welfare," Journal of Economic Theory, 18, 5980.
______(1980): "Ethical Indices for the Measurement of Poverty," Econometrica, 48, 10531062.
______(1984): "Ethical Social Index Numbers and the Measurement of Effective Tax/Benefit Progressivity," Canadian Journal of Economics, 17, 68394.
______(1993): "AdultEquivalence Scales and the Economic Implementation of Interpersonal Comparisons of WellBeing," Social Choice and Welfare, 10, 33561.
BLACKORBY, C., D. DONALDSON, AND M. AUERSPERG (1981): "A New Procedure for the Measurement of Inequality Within and Among Population Subgroups," Canadian Journal of Economics, 14, 665686.
BLANCHFLOWER, D. AND A. OSWALD (2000): "WellBeing Over Time in Britain and the USA," Tech. Rep. 7487, National Bureau of Economic Research.
BLUM, W. AND H. KALVEN JR. (1963): The Uneasy Case for Progressive Taxation, Chicago: University of Chicago Press.
BLUNDELL, R. (1998): "Equivalence Scales and Household Welfare: What Can Be Learned from Household Budget Data?" in The distribution of welfare and household production: International perspectives, ed. by S. P. Jenkins, A. Kapteyn, and B. M. S. van Praag, Cambridge; New York and Melbourne: Cambridge University Press, 36480.
BLUNDELL, R. AND A. LEWBEL (1991): "The Information Content of Equivalence Scales," Journal of Econometrics, 50, 4968.
BLUNDELL, R. AND I. PRESTON (1998): "Consumption Inequality and Income Uncertainty," Quarterly Journal of Economics, 113, 60340.
BODIER, M. AND D. COGNEAU, eds. (1998): L'évolution de la structure des prix et les inégalités de niveau de vie en France de 1974 à 1995. (Price Structure Trends and Standard of Living Inequalities in France from 1974 to 1995, With English summary.), vol. 135 of Économie et Prévision.
BORG, M., P. MASON, AND S. SHAPIRO (1991): "The Incidence of Taxes on Casino Gambling: Exploiting the Tired and Poor," American Journal of Economics and Sociology, 50, 32332.
BOSCH, A. (1991): "Economies of Scales, Location, Age, and Sex Discrimination in Household Demand," European Economic Review, 35, 15891595.
BOSSERT, W. (1990): "An Axiomatization of the SingleSeries Ginis," Journal of Economic Theory, 50, 8292.
BOURGUIGNON, F. (1979): "Decomposable Income Inequality Measures," Econometrica, 47, 901920.
BOURGUIGNON, F. AND G. FIELDS (1990): "Poverty Measures and AntiPoverty Policy," Recherches économiques de Louvain, 56, 40927.
______(1997): "Discontinuous Losses from Poverty, Generalized Pα Measures, and Optimal Transfers to the Poor," Journal of Public Economics, 63, 155-75.
BOURGUIGNON, F. AND C. MORRISSON (2002): "Inequality among World Citizens: 1820-1992," American Economic Review, 92, 727-44.
BRADBURY, B. (1997): "Measuring Poverty Changes with Bounded Equivalence Scales: Australia in the 1980s," Economica, 64, 245-64.
BREUNIG, R. (2001): "An Almost Unbiased Estimator of the Coefficient of Variation," Economics Letters, 70, 1519.
BUHMANN, B., L. RAINWATER, G. SCHMAUS, AND T. SMEEDING (1988): "Equivalence Scales, Well-Being, Inequality and Poverty: Sensitivity Estimates Across Ten Countries Using the Luxembourg Income Study Database," Review of Income and Wealth, 34, 115-42.
BURKHAUSER, R., J. FRICK, AND J. SCHWARZE (1997): "A Comparison of Alternative Measures of Economic WellBeing for Germany and the United States," Review of Income and Wealth, 43, 15371.
BURKHAUSER, R. AND J. POUPORE (1997): "A CrossNational Comparison of Permanent Inequality in the United States and Germany," Review of Economics and Statistics, 79, 1017.
BURKHAUSER, R., T. SMEEDING, AND J. MERZ (1996): "Relative Inequality and Poverty in Germany and the United States Using Alternative Equivalence Scales," Review of Income and Wealth, 42, 381400.
CANCIAN, M. AND D. REED (1998): "Assessing the Effects of Wives' Earnings on Family Income Inequality," Review of Economics and Statistics, 80, 7379.
CANTILLON, S. AND B. NOLAN (2001): "Poverty within Households: Measuring Gender Differences Using Nonmonetary Indicators," Feminist Economics, 7, 523.
CARLSON, M. AND S. DANZIGER (1999): "Cohabitation and the Measurement of Child Poverty," Review of Income and Wealth, 45, 179-91.
CASPERSEN, E. AND G. METCALF (1994): "Is a Value Added Tax Regressive? Annual versus Lifetime Incidence Measures," National Tax Journal, 47, 73146.
CASSADY, K., G. RUGGERI, AND D. VAN WART (1996): "On the Classification and Interpretation of Progressivity Measures," Public Finance Review, 51, 122.
CHAKRAVARTY, S. (1983a): "Ethically Flexible Measures of Poverty," Canadian Journal of Economics, XVI, 7485.
______(1983b): "A New Index of Poverty," Mathematical Social Sciences, 6, 307313.
______(1985): "Normative Indices for the Measurement of Horizontal Inequity," Journal of Quantitative Economics, 1, 8189.
______(1988): "Extended Gini Indices of Inequality," International Economic Review, 29, 147156.
______(1990): Ethical Social Index Numbers, New York, SpringerVerlag.
______(1997): "On Shorrocks' Reinvestigation of the Sen Poverty Index," Econometrica, 65, 124142.
______(1999): "Measuring Inequality: The Axiomatic Approach," in Handbook of income inequality measurement. With a foreword by Amartya Sen, ed. by J. Silber, Boston; Dordrecht and London: Kluwer Academic, Recent Economic Thought, 16384.
______(2001): "Why Measuring Inequality by the Variance Makes Sense from a Theoretical Point of View," Journal of Income Distribution, 10, 8296.
CHAKRAVARTY, S. AND A. CHAKRABORTY (1984): "On Indices of Relative Deprivation," Economics Letters, 14, 283287.
CHAKRAVARTY, S. AND D. MUKHERJEE (1998): "Optimal Subsidy for the Poor," Economics Letters, 61, 31319.
CHAMPERNOWNE, D. AND F. COWELL (1998): Economic inequality and income distribution, Cambridge; New York and Melbourne: Cambridge University Press.
CHANTREUIL, F. AND A. TRANNOY (1999): "Inequality Decomposition Values: The tradeoff between marginality and consistency," Tech. Rep. 9924, THEMA.
CHEN, S., G. DATT, AND M. RAVALLION (1994): "Is Poverty Increasing in the Developing World?" Review of Income and Wealth, 40, 35976.
CHEN, S. AND M. RAVALLION (2001): "How Did the World's Poorest Fare in the 1990s?" Review of Income and Wealth, 47, 283300.
CHEONG, K. (2002): "An Empirical Comparison of Alternative Functional Forms for the Lorenz Curve," Applied Economics Letters, 9, 17176.
CHEW, S. AND L. EPSTEIN (1989): "A Unifying Approach to Axiomatic Nonexpected Utility Theories," Journal of Economic Theory, 49, 20740.
CHOTIKAPANICH, D. AND W. GRIFFITHS (2001): "On Calculation of the Extended Gini Coefficient," Review of Income and Wealth, 47, 54147.
______(2002): "Estimating Lorenz Curves Using a Dirichlet Distribution," Journal of Business and Economic Statistics, 20, 29095.
CHRISTIANSEN, V. AND E. JANSEN (1978): "Implicit social preferences in the Norwegian system of indirect taxation," Journal of Public Economics, 10, 217245.
CLARK, A. AND A. OSWALD (1996): "Satisfaction and Comparison Income," Journal of Public Economics, 61, 359381.
CLARK, S., R. HEMMING, AND D. ULPH (1981): "On Indices for the Measurement of Poverty," The Economic Journal, 91, 515-526.
COCHRAN, W. (1977): Sampling Techniques, New York: Wiley US, 3rd ed.
CONNIFFE, D. (1992): "The Nonconstancy of Equivalence Scales," Review of Income and Wealth, 38, 42943.
CONSTANCE, F. AND R. MICHAEL (1995): Measuring poverty: A new approach, National Research Council, Commission on Behavioral and Social Sciences and Education, Panel on Poverty and Family Assistance, Washington, D.C.: National Academy Press.
CORNEO, G. AND H. GRUNER (2002): "Individual Preferences for Political Redistribution," Journal of Public Economics, 83, 83107.
CORNIA, G. AND F. STEWART (1995): Two Errors of targeting, Public Spending and the Poor: Theory and Evidence, Baltimore: Johns Hopkins University Press.
CORONADO, J., D. FULLERTON, AND T. GLASS (2000): "The Progressivity of Social Security," Tech. Rep. 7520, National Bureau of Economic Research.
COULOMBE, H. AND A. MCKAY (1998): "La mesure de la pauvreté: Vue d'ensemble et méthodologie avec illustration dans le cas du Ghana," L'Actualité économique, 74, 415-43.
COULTER, F., F. COWELL, AND S. JENKINS (1992a): "Differences in Needs and Assessment of Income Distributions," Bulletin of Economic Research, 44, 77124.
______(1992b): "Equivalence Scale Relativities and the Measurement of Inequality and Poverty," The Economic Journal, 102, 116.
COWELL, F. (1980): "On the Structure of Additive Inequality Measures," Review of Economic Studies, 47, 521531.
______(1989): "Sampling Variance and Decomposable Inequality Measures," Journal of Econometrics, 42, 2741.
______(1995): Measuring Inequality, Prentice Hall / Harvester Wheatsheaf.
______(1999): "Estimation of Inequality Indices," in Handbook of income inequality measurement. With a foreword by Amartya Sen, ed. by J. Silber, Boston; Dordrecht and London: Kluwer Academic, Recent Economic Thought, 26986.
______(2000): "Measurement of Inequality," in Handbook of income distribution. Volume 1. Handbooks in Economics, vol. 16, ed. by A. B. Atkinson and F. Bourguignon, Amsterdam; New York and Oxford: Elsevier Science, NorthHolland, 87166.
COWELL, F., F. FERREIRA, AND J. LITCHFIELD (1998): "Income Distribution in Brazil 19811990: Parametric and Nonparametric Approaches," Journal of Income Distribution, 8, 6376.
COWELL, F. AND S. JENKINS (1995): "How Much Inequality Can We Explain? A Methodology and an Application to the United States," Economic Journal, 105, 42130.
COWELL, F. AND E. SCHOKKAERT (2001): "Risk Perceptions and Distributional Judgments," European Economic Review, 45, 94152.
COWELL, F. AND M. P. VICTORIA FESER (1996a): "Poverty Measurement with Contaminated Data: A Robust Approach," European Economic Review, 40, 176171.
______(1996b): "Robustness Properties of Inequality Measures," Econometrica, 64, 77101.
______(2002): "Welfare Rankings in the Presence of Contaminated Data," Econometrica, 70, 122133.
CREEDY, J. (1996): "Comparing Tax and Transfer Systems: Poverty, Inequality and Target Efficiency," Economica, 63, S16374.
______(1997): "Lifetime Inequality and Tax Progressivity with Alternative Income Concepts," Review of Income and Wealth, 43, 28395.
______(1998a): "Measuring the Welfare Effects of Price Changes: A Convenient Parametric Approach," Australian Economic Papers, 37, 13751.
______(1998b): "The Welfare Effect on Different Income Groups of Indirect Tax Changes and Inflation in New Zealand," Economic Record, 74, 37383.
______(1999a): "Lifetime versus Annual Income Distribution," in Handbook of income inequality measurement. With a foreword by Amartya Sen, ed. by J. Silber, Boston; Dordrecht and London: Kluwer Academic, Recent Economic Thought, 51333.
______(1999b): "Marginal Indirect Tax Reform in Australia," Economic Analysis and Policy, 29, 114.
______(2001): "Indirect Tax Reform and the Role of Exemptions," Fiscal Studies, 22, 45786.
______(2002): "The GST and Vertical, Horizontal and Reranking Effects of Indirect Taxation in Australia," Australian Economic Review, 35, 38090.
CREEDY, J. AND J. VAN DE VEN (1997): "The Distributional Effects of Inflation in Australia 19801995," Australian Economic Review, 30, 12543.
______(2001): "Decomposing Redistributive Effects of Taxes and Transfers in Australia: Annual and Lifetime Measures," Australian Economic Papers, 40, 18598.
CUTLER, D. AND L. KATZ (1992): "Rising Inequality? Changes in the Distribution of Income and Consumption in the 1980's," The American Economic Review, 82, 546551.
DAGUM, C. (1997): "A New Approach to the Decomposition of the Gini Income Inequality Ratio," Empirical Economics, 22, 51531.
DALTON, H. (1920): "The Measurement of the Inequality of Incomes," The Economic Journal, 30, 34861.
DANZIGER, S. AND P. GOTTSCHALK (1995): America Unequal, Cambridge, MA.: Russell Sage Foundation and Harvard University Press.
DARDANONI, V. AND A. FORCINA (1999): "Inference for Lorenz Curve Orderings," Econometrics Journal, 2, 4975.
DARDANONI, V. AND P. LAMBERT (2001): "Horizontal Inequity Comparisons," Social Choice and Welfare, 18, 799816.
______(2002): "Progressivity Comparisons," Journal of Public Economics, 86, 99122.
DASGUPTA, P. (1993): An inquiry into wellbeing and destitution, Oxford: Oxford University Press.
DASGUPTA, P., A. SEN, AND D. STARRETT (1973): "Notes on the Measurement of Inequality," Journal of Economic Theory, 6, 180-187.
DATT, G. AND M. RAVALLION (1992): "Growth and Redistribution Components of Changes in Poverty Measures: A Decomposition with Applications to Brazil and India in the 1980's," Journal of Development Economics, 38, 275-295.
______(2002): "Is India's Economic Growth Leaving the Poor Behind?" Journal of Economic Perspectives, 16, 89108.
DAVIDSON, R. AND J.Y. DUCLOS (1997): "Statistical Inference for the Measurement of the Incidence of Taxes and Transfers," Econometrica, 65, 145365.
______(2000): "Statistical Inference for Stochastic Dominance and for the Measurement of Poverty and Inequality," Econometrica, 68, 143564.
DAVIES, H., H. JOSHI, AND L. CLARKE (1997): "Is It Cash That the Deprived Are Short Of?" Journal of the Royal Statistical Society, Series A, 160, 10726.
DAVIES, J., D. GREEN, AND H. PAARSCH (1998): "Economic Statistics and Social Welfare Comparisons: A Review," in Handbook of applied economic statistics, ed. by A. Ullah and D. E. A. Giles, Statistics: Textbooks and Monographs, vol. 155. New York; Basel and Hong Kong: Dekker, 138.
DAVIES, J. AND M. HOY (1994): "The Normative Significance of Using ThirdDegree Stochastic Dominance in Comparing Income Distributions," Journal of Economic Theory, 64, 52030.
______(1995): "Making Inequality Comparisons When Lorenz Curves Intersect," American Economic Review, 85, 98086.
______(2002): "Flat Rate Taxes and Inequality Measurement," Journal of Public Economics, 84, 33-46.
DAVIS, J. (1959): "A Formal Interpretation of the Theory of Relative Deprivation," Sociometry, 22, 280296.
DE GREGORIO, J. AND J. LEE (2002): "Education and Income Inequality: New Evidence from CrossCountry Data," Review of Income and Wealth, 48, 395416.
DE JANVRY, A. AND E. SADOULET (2000): "Growth, Poverty, and Inequality in Latin America: A Causal Analysis, 197094," Review of Income and Wealth, 46, 267288.
DE VOS, K. AND T. GARNER (1991): "An Evaluation of Subjective Poverty Definitions: Comparing Results from the U.S. and the Netherlands," Review of Income and Wealth, 37, 267-85.
DE VOS, K. AND A. ZAIDI (1997): "Equivalence Scale Sensitivity of Poverty Statistics for the Member States of the European Community," Review of Income and Wealth, 43, 319-33.
______(1998): "Poverty Measurement in the European Union: CountrySpecific or UnionWide Poverty Lines?" Journal of Income Distribution, 8, 7792.
DEATON, A. (1988): "Quality, Quantity, and Spatial Variation of Price," The American Economic Review, 78, 419430.
______(1998): The Analysis of Household Surveys: A Microeconometric Approach to Development Policy, Johns Hopkins University Press.
______(2001): "Counting the World's Poor: Problems and Possible Solutions," World Bank Research Observer, 16, 12547.
DECOSTER, A., E. SCHOKKAERT, AND G. VAN CAMP (1997): "Is Redistribution through Indirect Taxes Equitable?" European Economic Review, 41, 599608.
DECOSTER, A. AND G. VAN CAMP (2001): "Redistributive Effects of the Shift from Personal Income Taxes to Indirect Taxes: Belgium 19881993," Fiscal Studies, 22, 79106.
DEININGER, K. AND L. SQUIRE (1998): "New Ways of Looking at Old Issues: Inequality and Growth," Journal of Development Economics, 57, 25987.
DEITEL, H. M. AND P. J. DEITEL (2003): Java How to Program, New York: Prentice Hall.
DEL RIO, C. AND J. RUIZ CASTILLO (2001): "Intermediate Inequality and Welfare: The Case of Spain, 198081 to 199091," Review of Income and Wealth, 47, 22137.
DEL RIO, C. AND J. RUIZ CASTILLO (2001): "TIPs for Poverty Analysis: The Case of Spain, 198081 to 199091," Investigaciones Economicas, 25, 6391.
DESAI, M. AND A. SHAH (1988): "An Econometric Approach to the Measurement of Poverty," Oxford Economic Papers, 40, 505522.
DEUTSCH, J. AND J. SILBER (1997): "Gini's 'Transvariazione' and the Measurement of Distance between Distributions," Empirical Economics, 22, 547-54.
______(1999a): "Inequality Decomposition by Population Subgroups and the Analysis of Interdistributional Inequality," in Handbook of income inequality measurement. With a foreword by Amartya Sen, ed. by J. Silber, Boston; Dordrecht and London: Kluwer Academic, Recent Economic Thought, 36397.
______(1999b): "On Some Implications of Dagum's Interpretation of the Decomposition of the Gini Index by Population Subgroups," in Advances in econometrics, income distribution and scientific methodology: Essays in honor of Camilo Dagum, ed. by D. J. Slottje, Heidelberg: Physica, 26991.
DILNOT, A., J. KAY, AND C. NORRIS (1984): "The UK Tax System, Structure and Progressivity, 1948-1982," Scandinavian Journal of Economics, 86, 150-65.
DOLAN, P. AND A. ROBINSON (2001): "The Measurement of Preferences over the Distribution of Benefits: The Importance of the Reference Point," European Economic Review, 45, 16971709.
DOLLAR, D. AND A. KRAAY (2002): "Growth Is Good for the Poor," Journal of Economic Growth, 7, 195225.
DONALDSON, D. (1992): "On the Aggregation of Money Measures of WellBeing in Applied Welfare Economics," Journal of Agricultural and Resource Economics, 17, 88102.
DONALDSON, D. AND J. WEYMARK (1980): "A Single Parameter Generalization of the Gini Indices of Inequality," Journal of Economic Theory, 22, 6786.
______(1983): "Ethically Flexible Gini Indices for Income Distributions in the Continuum," Journal of Economic Theory, 29, 353358.
______(1986): "Properties of Fixed-Population Poverty Indices," International Economic Review, 27, 667-688.
DUCLOS, J.Y. (1993): "Progressivity, Redistribution, and Equity, with Application to the British Tax and Benefit System," Public Finance, 48, 35065.
______(1995a): "Assessing the Performance of an Income Tax," Bulletin of Economic Research, 47, 11526.
______(1995b): "On Equity Aspects of Imperfect Poverty Relief," Review of Income and Wealth, 41, 177190.
______(1997a): "The Asymptotic Distribution of Linear Indices of Inequality, and Redistribution," Economics Letters, 54, 5157.
______(1997b): "Measuring Progressivity and Inequality," in Inequality and taxation, ed. by S. Zandvakili, Research on Economic Inequality, vol. 7, Greenwich, Conn. and London: JAI Press, 19-37.
______(1998): "Social Evaluation Functions, Economic Isolation and the Suits Index of Progressivity," Journal of Public Economics, 69, 10321.
______(2000): "Gini Indices and the Redistribution of Income," International Tax and Public Finance, 7, 14162.
DUCLOS, J.Y. AND P. GREGOIRE (2002): "Absolute and Relative Deprivation and the Measurement of Poverty," Review of Income and Wealth, 48, 47192.
DUCLOS, J.Y., V. JALBERT, AND A. ARAAR (2003): "Classical horizontal inequity and reranking: an integrated approach," Research on Economic Inequality, 10, 65100.
DUCLOS, J.Y. AND P. LAMBERT (2000): "A Normative Approach to Measuring Classical Horizontal Inequity," Canadian Journal of Economics, 33, 87-113.
DUCLOS, J.Y. AND M. MERCADER PRATS (1999): "Household Needs and Poverty: With Application to Spain and the U.K," Review of Income and Wealth, 45, 7798.
DUCLOS, J.Y. AND M. TABI (1996): "The Measurement of Progressivity, with an Application to Canada," Canadian Journal of Economics, 29.
______(1999): "Inégalité et redistribution du revenu, avec une application au Canada," L'Actualité économique, 75, 95122.
DURO, J. AND J. ESTEBAN (1998): "Factor Decomposition of CrossCountry Income Inequality, 19601990," Economics Letters, 60, 26975.
EBERT, U. (1999): "Using Equivalent Income of Equivalent Adults to Rank Income Distributions," Social Choice and Welfare, 16, 23358.
EBERT, U. AND P. MOYES (2000): "An Axiomatic Characterization of Yitzhaki's Index of Individual Deprivation," Economics Letters, 68, 26370.
______(2003): "Equivalence Scales Reconsidered," Econometrica, 71, 3193.
EFRON, B. (1979): "Bootstrap methods: Another look at the jackknife," The Annals of Statistics, 7, 126.
EFRON, B. AND R. TIBSHIRANI (1993): An introduction to the bootstrap, London: Chapman and Hall.
ELBERS, C., J. LANJOUW, AND P. LANJOUW (2003): "Micro-Level Estimation of Poverty and Inequality," Econometrica, 71, 355-64.
ERBAS, N. AND C. SAYERS (1998): "Is the United States CPI Biased Across Income and Age Groups?" Tech. Rep. WP/98/136, International Monetary Fund.
ESSAMA NSSAH, B. (1997): "Impact of Growth and Distribution on Poverty in Madagascar," Review of Income and Wealth, 43, 23952.
______(2000): Inégalité, pauvreté et bien-être social: Fondements analytiques et normatifs, Bruxelles: De Boeck & Larcier.
ESSAMA NSSAH, B. (2004): "Framing Pro-Poor Growth within the Logic of Social Evaluation," Tech. rep., Poverty Reduction Group, The World Bank.
FELDSTEIN, M. (1976): "On the Theory of Tax Reform," Journal of Public Economics, 6, 77-104.
FELLMAN, J. (1976): "The Effect of Transformations on Lorenz Curves," Econometrica, 44, 8234.
______(2001): "Mathematical Properties of Classes of Income Redistributive Policies," European Journal of Political Economy, 17, 17992.
FELLMAN, J., M. JÄNTTI, AND P. LAMBERT (1999): "Optimal TaxTransfer Systems and Redistributive Policy," Scandinavian Journal of Economics, 101, 11526.
FERREIRA, L., R. BUSE, AND J. P. CHAVAS (1998): "Is There Bias in Computing Household Equivalence Scales?" Review of Income and Wealth, 44, 18398.
FESTINGER, L. (1954): "A Theory of Social Comparison Processes," Human Relations, 7, 117140.
FIELDS, G. (1994): "Data for Measuring Poverty and Inequality Changes in the Developing Countries," Journal of Development Economics, 44, 87-102.
FIELDS, G. AND G. YOO (2000): "Falling Labor Income Inequality in Korea's Economic Growth: Patterns and Underlying Causes," Review of Income and Wealth, 46, 13959.
FINKE, M., W. CHERN, AND J. FOX (1997): "Do the Urban Poor Pay More for Food? Issues in Measurement," Advancing the Consumer Interest, 9, 1317.
FISHBURN, P. AND R. VICKSON (1978): "Theoretical Foundation of Stochastic Dominance," in Stochastic Dominance, ed. by G. Whitmore and M. Findlay, Lexington Books.
FISHBURN, P. AND R. WILLIG (1984): "Transfer Principles in Income Redistribution," Journal of Public Economics, 25, 323328.
FISHER, G. (1992): "The Development and History of the Poverty Thresholds," Social Security Bulletin, 55, 314.
______(1995): "Is there Such a Thing as an Absolute Poverty Line Over Time? Evidence from the United States, Britain, Canada, and Australia on the Income Elasticity of the Poverty Line." Poverty measurement working papers, US Census Bureau.
FLEURBAEY, M., C. HAGNERÉ, AND A. TRANNOY (2003): "Welfare Comparisons with Bounded Equivalence Scales," Journal of Economic Theory, 110, 30996.
FONG, C. (2001): "Social Preferences, SelfInterest, and the Demand for Redistribution," Journal of Public Economics, 82, 22546.
FORMBY, J., H. KIM, AND B. ZHENG (2001): "Sen Measures of Poverty in the United States: Cash versus Comprehensive Incomes in the 1990s," Pacific Economic Review, 6, 193210.
FORMBY, J., T. SEAKS, AND W. SMITH (1984): "Difficulties in the Measurement and Comparison of Tax Progressivity: The Case of North America," Public Finance/Finances Publiques, 39, 297313.
FORMBY, J., J. SMITH, AND B. ZHENG (1999): "The Coefficient of Variation, Stochastic Dominance and Inequality: A New Interpretation," Economics Letters, 62, 31923.
FORMBY, J., W. SMITH, AND D. SYKES (1986): "Income Redistribution and Local Tax Progressivity: A Reconsideration," Canadian Journal of Economics/Revue canadienne d'économique, XIX, 80811.
FORMBY, J., W. SMITH, AND P. THISTLE (1987): "Difficulties in the Measurement of Tax Progressivity: Further Analysis," Public Finance/Finances Publiques, 42, 43845.
______(1990): "The Average Tax Burden and the Welfare Implications of Global Tax Progressivity," Public Finance Quarterly, 18, 324.
FORTIN, B., M. TRUCHON, AND L. BEAUSÉJOUR (1990): "On Reforming the Welfare System: Workfare Meets the Negative Income Tax," Journal of Public Economics, 51, 11951.
FOSTER, J. (1984): "On Economic Poverty: A Survey of Aggregate Measures," in Advances in Econometrics, Connecticut: JAI Press, vol. 3, 215251.
______(1998): "Absolute versus Relative Poverty," American Economic Review, 88, 33541.
FOSTER, J., J. GREER, AND E. THORBECKE (1984): "A Class of Decomposable Poverty Measures," Econometrica, 52, 761776.
FOSTER, J. AND E. OK (1999): "Lorenz Dominance and the Variance of Logarithms," Econometrica, 67, 90107.
FOSTER, J. AND A. SEN (1997): On Economic Inequality after a Quarter Century, Oxford, Clarendon Press.
FOSTER, J. AND A. SHNEYEROV (1999): "A General Class of Additively Decomposable Inequality Measures," Economic Theory, 14, 89111.
______(2000): "Path Independent Inequality Measures," Journal of Economic Theory, 91, 199222.
FOSTER, J. AND A. SHORROCKS (1988a): "Inequality and Poverty Orderings," European Economic Review, 32, 654661.
______(1988b): "Poverty Orderings," Econometrica, 56, 173177.
______(1988c): "Poverty Orderings and Welfare Dominance," Social Choice and Welfare, 5, 179-98.
FOURNIER, M. (2001): "Inequality Decomposition by Factor Component: A 'Rank-Correlation' Approach Illustrated on the Taiwanese Case," Recherches Economiques de Louvain/Louvain Economic Review, 67, 381-403.
GIBSON, J., J. HUANG, AND S. ROZELLE (2001): "Why Is Income Inequality So Low in China Compared to Other Countries? The Effect of Household Survey Methods," Economics Letters, 71, 32933.
GILES, C. AND P. JOHNSON (1994): "Tax Reform in the UK and Changes in the Progressivity of the Tax System, 198595," Fiscal Studies, 15, 6486.
GINI, C. (1914): "Sulla misura della concentrazione e della variabilita dei caratteri," Atti del Reale istituto Veneto di Scienze, Lettere ed Arti, 73, 12031248.
______(2005): "On the measurement of concentration and variability of characters," Metron: International Journal of Statistics, 63, 338.
GLENNERSTER, H. (2000): "US poverty studies and poverty measurement: the past twentyfive years," CASE paper 42, STICERD.
GLEWWE, P. (1991): "Household Equivalence Scales and the Measurement of Inequality: Transfers from the Poor to the Rich Could Decrease Inequality," Journal of Public Economics, 44, 21116.
______(1992): "Targeting Assistance to the Poor: Efficient Allocation of Transfers when Household Income Is Not Observed," Journal of Development Economics, 38, 297321.
______(2001): Attacking Poverty, Oxford: Oxford University Press.
GOEDHART, T., V. HALBERSTADT, A. KAPTEYN, AND B. VAN PRAAG (1977): "The poverty line: concept and measurement," Journal of Human Resources, XII, 503-520.
GOERLICH GISBERT, F. (2001): "On Factor Decomposition of Cross-Country Income Inequality: Some Extensions and Qualifications," Economics Letters, 70, 303-09.
GOTTSCHALK, P. AND T. SMEEDING (1997): "Cross-National Comparisons of Earnings and Income Inequality," Journal of Economic Literature, 35, 633-87.
______(2000): "Empirical Evidence on Income Inequality in Industrial Countries," in Handbook of income distribution. Volume 1. Handbooks in Economics, vol. 16, ed. by A. B. Atkinson and F. Bourguignon, Amsterdam; New York and Oxford: Elsevier Science, North-Holland, 261-307.
GOUVEIA, M. AND J. TAVARES (1995): "The Distribution of Household Income and Expenditure in Portugal: 1980 and 1990," Review of Income and Wealth, 41, 1-17.
GRAVELLE, J. (1992): "Equity Effects of the Tax Reform Act of 1986," Journal of Economic Perspectives, 6, 27-44.
GREER, J. AND E. THORBECKE (1986): "A Methodology for Measuring Food Poverty Applied to Kenya," Journal of Development Economics, 24, 59-74.
GROOTAERT, C. AND R. KANBUR (1995): "The Lucky Few Amidst Economic Decline: Distributional Change in Côte d'Ivoire as Seen Through Panel Data Sets, 1985-88," The Journal of Development Studies, 31, 603-619.
GROSH, M. (1995): "Toward Quantifying the Trade-Off: Administrative Costs and Incidence in Targeted Programs in Latin America," in Public spending and the poor: Theory and evidence, ed. by D. van de Walle and K. Nead, Baltimore and London: Johns Hopkins University Press for the World Bank, 450-88.
GUERON, J. (1990): "Work and Welfare: Lessons on Employment Programs," Journal of Economic Perspectives, 4, 79-98.
GUSTAFSSON, B. AND N. MAKONNEN (1993): "Poverty and Remittances in Lesotho," Journal of African Economies, 2, 49-73.
GUSTAFSSON, B. AND L. NIVOROZHKINA (1996): "Relative Poverty in Two Egalitarian Societies: A Comparison between Taganrog, Russia during the Soviet Era and Sweden," Review of Income and Wealth, 42, 321-34.
GUSTAFSSON, B. AND L. SHI (1997): "Types of Income and Inequality in China at the End of 1980s," Review of Income and Wealth, 43, 211-26.
______(2001): "The Effects of Transition on the Distribution of Income in China: A Study Decomposing the GINI Coefficient for 1988 and 1995," Economics of Transition, 9, 593-617.
______(2002): "Income Inequality within and across Counties in Rural China 1988 and 1995," Journal of Development Economics, 69, 179-204.
HADDAD, L. AND R. KANBUR (1990): "How Serious is the Neglect of Intra-Household Inequality?" The Economic Journal, 100, 866-881.
______(1992): "Intrahousehold inequality and the theory of targeting," European Economic Review, 36, 372-78.
HAGENAARS, A. (1986): The Perception of Poverty, Amsterdam: North Holland Publishing Company.
______(1987): "A Class of Poverty Indices," International Economic Review, 28, 583-607.
HAGENAARS, A. AND K. DE VOS (1988): "The Definition and Measurement of Poverty," The Journal of Human Resources, XXIII, 211-221.
HAGENAARS, A. AND B. VAN PRAAG (1985): "A Synthesis of Poverty Line Definitions," The Review of Income and Wealth, 31, 139-153.
HAINSWORTH, G. (1964): "The Lorenz Curve as a General Tool of Economic Analysis," Economic Record, 40, 426-41.
HANRATTY, M. AND R. BLANK (1992): "Down and Out in North America: Recent Trends in Poverty Rates in the United States and Canada," Quarterly Journal of Economics, 107, 233-54.
HARDING, A. (1993): "Lifetime vs Annual Tax-Transfer Incidence: How Much Less Progressive?" Economic Record, 69, 179-91.
______(1995): "The Impact of Health, Education and Housing Outlays upon Income Distribution in Australia in the 1990s," Australian Economic Review, 0, 71-86.
HÄRDLE, W. (1990): Applied Nonparametric Regression, vol. XV, Cambridge: Cambridge University Press.
HAVEMAN, R. AND A. BERSHADKER (2001): "The "Inability to be Self-Reliant" As an Indicator of Poverty: Trends for the U.S., 1975-97," Review of Income and Wealth, 47, 335-60.
HAYES, K., D. SLOTTJE, AND P. LAMBERT (1992): "Evaluating Effective Income Tax Progression," Journal of Public Economics, 56, 461-74.
HEADY, C., T. MITRAKOS, AND P. TSAKLOGLOU (2001): "The Distributional Impact of Social Transfers in the European Union: Evidence from the ECHP," Fiscal Studies, 22, 547-65.
HENTSCHEL, J., J. LANJOUW, P. LANJOUW, AND J. POGGI (2000): "Combining Census and Survey Data to Trace the Spatial Dimensions of Poverty: A Case Study of Ecuador," World Bank Economic Review, 14, 147-65.
HENTSCHEL, J. AND P. LANJOUW (1996): "Constructing an Indicator of Consumption for the Analysis of Poverty: Principles and Illustrations with Reference to Ecuador," Working Paper 124, World Bank, Living Standards Measurement Study.
HETTICH, W. (1983): "Reform of the Tax Base and Horizontal Equity," National Tax Journal, 36, 417-27.
HEY, J. AND P. LAMBERT (1980): "Relative Deprivation and the Gini Coefficient: Comment," Quarterly Journal of Economics, 95, 567-573.
HILL, C. AND R. MICHAEL (2001): "Measuring Poverty in the NLSY97," Journal of Human Resources, 36, 727-61.
HILLS, J. (1991): "Distributional Effects of Housing Subsidies in the United Kingdom," Journal of Public Economics, 44, 321-52.
HORRACE, W., P. SCHMIDT, AND A. WITTE (1995): "Sampling Errors and Confidence Intervals for Order Statistics: Implementing the Family Support Act," National Bureau of Economic Research, 32.
HOWARD, R., G. RUGGERI, AND D. VAN WART (1994): "The redistributional Impact of Taxation in Canada," Canadian Tax Journal, 42.
HOWES, S. AND J. LANJOUW (1998): "Does Sample Design Matter for Poverty Rate Comparisons?" Review of Income and Wealth, 44, 99-109.
HUNGERFORD, T. (1996): "The Distribution and Antipoverty Effectiveness of U.S. Transfers, 1992," Journal of Human Resources, 31, 255-73.
HYSLOP, D. (2001): "Rising U.S. Earnings Inequality and Family Labor Supply: The Covariance Structure of Intrafamily Earnings," American Economic Review, 91, 755-77.
ICELAND, J., K. SHORT, T. GARNER, AND D. JOHNSON (2001): "Are Children Worse Off? Evaluating Well-Being Using a New (and Improved) Measure of Poverty," Journal of Human Resources, 36, 398-412.
IDSON, T. AND C. MILLER (1999): "Calculating a Price Index for Families with Children: Implications for Measuring Trends in Child Poverty Rates," Review of Income and Wealth, 45, 217-33.
IMMONEN, R., R. KANBUR, M. KEEN, AND M. TUOMALA (1998): "Tagging and Taxing: The Optimal Use of Categorical and Income Information in Designing Tax/Transfer Schemes," Economica, 65, 179-92.
JAKOBSSON, U. (1976): "On the Measurement of the Degree of Progression," Journal of Public Economics, 5, 161-68.
JANTTI, M. (1997): "Inequality in Five Countries in the 1980s: The Role of Demographic Shifts, Markets and Government Policies," Economica, 64, 415-40.
JANTTI, M. AND S. DANZIGER (2000): "Income Poverty in Advanced Countries," in Handbook of income distribution. Volume 1. Handbooks in Economics, vol. 16, ed. by A. B. Atkinson and F. Bourguignon, Amsterdam; New York and Oxford: Elsevier Science, North-Holland, 309-78.
JENKINS, S. (1988a): "Empirical Measurement of Horizontal Inequity," Journal of Public Economics, 37, 305-329.
______(1988b): "Reranking and the Analysis of Income Redistribution," Scottish Journal of Political Economy, 35, 65-76.
______(1991): "Poverty Measurement and the Within-Household Distribution: Agenda for Action," Journal of Social Policy, 20, 457-483.
______(1995): "Accounting for Inequality Trends: Decomposition Analyses for the UK, 1971-86," Economica, 62, 29-63.
JENKINS, S. AND F. COWELL (1994): "Parametric Equivalence Scales and Scale Relativities," Economic Journal, 104, 891-900.
JENKINS, S. AND P. LAMBERT (1997): "Three 'I's of Poverty Curves, with an Analysis of UK Poverty Trends," Oxford Economic Papers, 49, 317-27.
______(1998a): "Ranking Poverty Gap Distributions: Further TIPs for Poverty Analysis," in Research on economic inequality. Volume 8, ed. by D. J. Slottje, Stamford, Conn. and London: JAI Press, 31-38.
______(1998b): ""Three 'I's of Poverty" Curves and Poverty Dominance: TIPs for Poverty Analysis," in Research on economic inequality. Volume 8, ed. by D. J. Slottje, Stamford, Conn. and London: JAI Press, 39-56.
______(1999): "Horizontal Inequity Measurement: A Basic Reassessment," in Handbook of income inequality measurement. With a foreword by Amartya Sen, ed. by J. Silber, Boston; Dordrecht and London: Kluwer Academic, Recent Economic Thought, 535-53.
JENKINS, S. AND N. O'LEARY (1996): "Household Income Plus Household Production: The Distribution of Extended Income in the U.K.," Review of Income and Wealth, 42, 401-19.
JENSEN, R. AND K. RICHTER (2001): "Understanding the Relationship between Poverty and Children's Health," European Economic Review, 45, 1031-39.
JOHNSON, D. AND S. SHIPP (1997): "Trends in Inequality Using Consumption-Expenditures: The U.S. from 1960 to 1993," Review of Income and Wealth, 43, 133-52.
JORGENSON, D. (1998): "Did We Lose the War on Poverty?" Journal of Economic Perspectives, 12, 79-96.
JUSTER, T. AND K. KUESTER (1991): "Differences in the Measurement of Wealth, Wealth Inequality and Wealth Composition Obtained from Alternative U.S. Wealth Surveys," Review of Income and Wealth, 37, 33-62.
KAKWANI, N. (1977a): "Applications of Lorenz Curves in Economic Analysis," Econometrica, 45, 719-28.
______(1977b): "Measurement of Tax Progressivity: An International Comparison," Economic Journal, 87, 71-80.
______(1980): "On a Class of Poverty Measures," Econometrica, 48, 437-446.
______(1984): "On the Measurement of Tax Progressivity and Redistributive Effect of Taxes with Applications to Horizontal and Vertical Equity," Advances in Econometrics, 3, 149-68.
______(1986): Analysing Redistribution Policies: A Study Using Australian Data, Cambridge: Cambridge University Press.
______(1987): "Measures of Tax Progressivity and Redistribution Effect: A Comment," Public Finance/Finances Publiques, 42, 431-37.
______(1993): "Statistical Inference in the Measurement of Poverty," Review of Economics and Statistics, 75, 632-39.
KAKWANI, N., S. KHANDKER, AND H. SON (2003): "Poverty Equivalent Growth Rate: With Applications to Korea and Thailand," Tech. rep., Economic Commission for Africa.
KAKWANI, N. AND P. LAMBERT (1998): "On Measuring Inequity in Taxation: A New Approach," European Journal of Political Economy, 14, 369-80.
______(1999): "Measuring Income Tax Discrimination," Review of Economics and Statistics, 81, 27-31.
KAKWANI, N. AND E. PERNIA (2000): "What is Pro Poor Growth?" Asian Development Review, 18, 1-16.
KANBUR, R. (1985): "Budgetary Rules for Poverty Alleviation," Discussion Paper 83, ESRC Programme on Taxation, Incentives and the Distribution of Income.
KANBUR, R. AND L. HADDAD (1994): "Are Better Off Households More Unequal or Less Unequal?" Oxford Economic Papers, 46, 445-458.
KANBUR, R., M. KEEN, AND M. TUOMALA (1994a): "Labor Supply and Targeting in Poverty Alleviation Programs," World Bank Economic Review, 8, 191-211.
______(1994b): "Optimal Nonlinear Income Taxation for the Alleviation of Income-Poverty," European Economic Review, 38, 1613-32.
KAPLOW, L. (1989): "Horizontal Equity: Measures in Search of a Principle," National Tax Journal, XLII, 139-154.
______(1995): "A Fundamental Objection to Tax Equity Norms: A Call for Utilitarianism," National Tax Journal, 48, 497-514.
______(2000): "Horizontal Equity: New Measures, Unclear Principles," Working Paper 7649, National Bureau of Economic Research, NBER.
KARAGIANNIS, E. AND M. KOVAČEVIĆ (2000): "A Method to Calculate the Jackknife Variance Estimator for the Gini Coefficient," Oxford Bulletin of Economics and Statistics, 62, 119-22.
KAUR, A., B. L. S. PRAKASA RAO, AND H. SINGH (1994): "Testing for Second-Order Stochastic Dominance of Two Distributions," Econometric Theory, 10, 849-66.
KEEN, M. (1992): "Needs and Targeting," Economic Journal, 102, 67-79.
KEEN, M., H. PAPAPANAGOS, AND A. SHORROCKS (2000): "Tax Reform and Progressivity," Economic Journal, 110, 50-68.
KEENEY, M. (2000): "The Distributional Impact of Direct Payments on Irish Farm Incomes," Journal of Agricultural Economics, 51, 252-65.
KENNICKELL, A. AND L. WOODBURN (1999): "Consistent Weight Design for the 1989, 1992 and 1995 SCFs, and the Distribution of Wealth," Review of Income and Wealth, 45, 193-215.
KHETAN, C. AND S. PODDAR (1976): "Measurement of Income Tax Progression in A Growing Economy: The Canadian Experience," Canadian Journal of Economics/Revue canadienne d'économique, IX, 613-29.
KIEFER, D. (1984): "Distributional Tax Progressivity Indexes," National Tax Journal, 37, 497-513.
KING, M. (1983): "An Index of Inequality: With Applications to Horizontal Equity and Social Mobility," Econometrica, 51, 99-116.
KLASEN, S. (2000): "Measuring Poverty and Deprivation in South Africa," Review of Income and Wealth, 46, 33-58.
______(2003): "In Search of the Holy Grail: How to Achieve Pro-Poor Growth?" Tech. Rep. Discussion Paper #96, Ibero-America Institute for Economic Research, Georg-August-Universität, Göttingen.
KLAVUS, J. (2001): "Statistical Inference of Progressivity Dominance: An Application to Health Care Financing Distributions," Journal of Health Economics, 20, 363-77.
KODDE, D. A. AND F. C. PALM (1986): "Wald Criteria for Jointly Testing Equality and Inequality Restrictions," Econometrica, 54, 1243-1248.
KOLM, S.-C. (1969): "The Optimal Production of Justice," in Public Economics, ed. by J. Margolis and S. Guitton, London: MacMillan.
______(1976a): "Unequal Inequalities, I," Journal of Economic Theory, 12, 416-42.
______(1976b): "Unequal Inequalities, II," Journal of Economic Theory, 13, 82-111.
KROLL, Y. AND L. DAVIDOVITZ (2003): "Inequality Aversion versus Risk Aversion," Economica, 70, 19-29.
KUNDU, A. AND T. SMITH (1983): "An Impossibility Theorem on Poverty Indices," International Economic Review, 24, 423-434.
LAMBERT, P. (1993): "Evaluating Impact Effects of Tax Reforms," Journal of Economic Surveys, 7, 205-42.
______(2001): The distribution and redistribution of income, Third edition. Manchester and New York: Manchester University Press; distributed by Palgrave, New York.
LAMBERT, P. AND W. PFAHLER (1992): "Income Tax Progression and Redistributive Effect: The Influence of Changes in the Pretax Income Distribution," Public Finance, 47, 1-16.
LAMBERT, P. AND X. RAMOS (1997a): "Horizontal Inequity and Reranking: A Review and Simulation Study," in Inequality and taxation. Research on Economic Inequality, vol. 7, ed. by S. Zandvakili, Greenwich, Conn. and London: JAI Press, 1-18.
______(1997b): "Horizontal Inequity and Vertical Redistribution," International Tax and Public Finance, 4, 25-37.
LAMBERT, P. AND S. YITZHAKI (1995): "Equity, Equality and Welfare," European Economic Review, 39, 674-82.
LANCASTER, G. AND R. RAY (2002): "International Poverty Comparisons on Unit Record Data of Developing and Developed Countries," Australian Economic Papers, 41, 129-39.
LANJOUW, J. AND P. LANJOUW (2001): "How to Compare Apples and Oranges: Poverty Measurement Based on Different Definitions of Consumption," Review of Income and Wealth, 47, 25-42.
LANJOUW, P. AND M. RAVALLION (1995): "Poverty and Household Size," Economic Journal, 105, 1415-34.
______(1999): "Benefit Incidence, Public Spending Reforms, and the Timing of Program Capture," World Bank Economic Review, 13, 257-73.
LATHAM, R. (1993): "A Necessary and Sufficient Condition for Greater Conditional Progressivity," Canadian Journal of Economics, 26, 919-32.
LAYTE, R., B. MAITRE, B. NOLAN, AND C. WHELAN (2001): "Persistent and Consistent Poverty in the 1994 and 1995 Waves of the European Community Household Panel Survey," Review of Income and Wealth, 47, 427-49.
LAZEAR, E. AND R. MICHAEL (1988): Allocation of Income Within the Household, Chicago: The University of Chicago Press.
LE BRETON, M., P. MOYES, AND A. TRANNOY (1996): "Inequality Reducing Properties of Composite Taxation," Journal of Economic Theory, 69, 71-103.
LEIBBRANDT, M., C. WOOLARD, AND I. WOOLARD (2000): "The Contribution of Income Components to Income Inequality in the Rural Former Homelands of South Africa: A Decomposable Gini Analysis," Journal of African Economies, 9, 79-99.
LERMAN, R. (1999): "How Do Income Sources Affect Income Inequality?" in Handbook of income inequality measurement. With a foreword by Amartya Sen, ed. by J. Silber, Boston; Dordrecht and London: Kluwer Academic, Recent Economic Thought, 341-58.
LERMAN, R. AND S. YITZHAKI (1985): "Income Inequality Effects by Income Source: A New Approach and Applications to the United States," Review of Economics and Statistics, 67, 151-156.
______(1989): "Improving the Accuracy of Estimates of Gini Coefficients," Journal of Econometrics, 42, 43-47.
______(1995): "Changing Ranks and the Inequality Impacts of Taxes and Transfers," National Tax Journal, 48, 45-59.
LEWBEL, A. (1989): "Household Equivalence Scales and Welfare Comparisons," Journal of Public Economics, 39, 377-391.
LEWIS, J. AND W. LOFTUS (2000): Java, Software Solutions, USA: Addison Wesley, second ed.
LIBERATI, P. (2001): "The Distributional Effects of Indirect Tax Changes in Italy," International Tax and Public Finance, 8, 27-51.
LIPTON, M. AND M. RAVALLION (1995): "Poverty and Policy," in Handbook of development economics. Volume 3B, ed. by J. Behrman and T. N. Srinivasan, Amsterdam; New York and Oxford: Elsevier Science, North Holland, 2551-2657.
LIU, P. W. (1985): "Lorenz Domination and Global Tax Progressivity," Canadian Journal of Economics/Revue canadienne d'Economique, XVIII, 395-99.
LOOMIS, J. AND C. REVIER (1988): "Measuring Progressivity of Excise Taxes: A Buyers Index," Public Finance Quarterly, 16, 301-14.
LORENZ, M. (1905): "Method of measuring the concentration of wealth," Publications of the American Statistical Association, 70.
LUNDBERG, S., R. POLLAK, AND T. WALES (1997): "Do Husbands and Wives Pool Their Resources? Evidence from the United Kingdom Child Benefit," Journal of Human Resources, 32, 463-80.
LUNDIN, D. (2001): "Welfare-Improving Carbon Dioxide Tax Reform Taking Externality and Location into Account," International Tax and Public Finance, 8, 815-35.
LYON, A. AND R. SCHWAB (1995): "Consumption Taxes in a Life-Cycle Framework: Are Sin Taxes Regressive," Review of Economics and Statistics, 77, 398-406.
LYSSIOTOU, P. (1997): "Comparison of Alternative Tax and Transfer Treatment of Children Using Adult Equivalence Scales," Review of Income and Wealth, 43, 105-17.
MAASOUMI, E. AND A. HESHMATI (2000): "Stochastic Dominance amongst Swedish Income Distributions," Econometric Reviews, 19, 287-320.
MACKINNON, J. (2002): "Bootstrap inference in econometrics," Canadian Journal of Economics/Revue canadienne d'Economique, 35, 615-645.
MADDEN, D. (2000): "Relative or Absolute Poverty Lines: A New Approach," Review of Income and Wealth, 46, 181-99.
MAKDISSI, P. AND Y. GROLEAU (2002): "Que pouvons-nous apprendre des profils de pauvreté canadiens?" L'Actualité Economique/Revue D'Analyse Economique, 78, 257-86.
MAKDISSI, P. AND Q. WODON (2002): "Consumption Dominance Curves: Testing for the Impact of Indirect Tax Reforms on Poverty," Economics Letters, 75, 227-35.
MAYERES, I. AND S. PROOST (2001): "Marginal Tax Reform, Externalities and Income Distribution," Journal of Public Economics, 79, 343-63.
MAYSHAR, J. AND S. YITZHAKI (1995): "Dalton-Improving Indirect Tax Reform," American Economic Review, 85, 793-807.
______(1996): "Dalton-Improving Tax Reform: When Households Differ in Ability and Needs," Journal of Public Economics, 62, 399-412.
MCCLEMENTS, L. (1977): "Equivalence Scales for Children," Journal of Public Economics, 8, 191-210.
MCCULLOCH, N. AND B. BAULCH (1999): "Tracking pro-poor growth," Tech. Rep. ID21 insights #31, Sussex, Institute of Development Studies.
MEENAKSHI, J. V. AND R. RAY (2002): "Impact of Household Size and Family Composition on Poverty in Rural India," Journal of Policy Modeling, 24, 539-59.
MEHRAN, F. (1976): "Linear Measures of Income Inequality," Econometrica, 44, 805-09.
MERCADER PRATS, M. (1997): "On the Distributive and Incentive Effects of the Spanish Income Tax: A Comparison of 1980 and 1994," European Economic Review, 41, 609-17.
MERTON, R. K. AND A. S. ROSSI (1957): "Contributions to the Theory of Reference Group Behaviour," in Social Theory and Social Structure, ed. by R. Merton, Glencoe.
METCALF, G. (1994): "The Lifetime Incidence of State and Local Taxes: Measuring Changes During the 1980s," Tech. Rep. 4252, National Bureau of Economic Research.
MILANOVIC, B. (1992): "Poverty in Poland, 1978-88," Review of Income and Wealth, 38, 329-40.
______(1994a): "Comment on 'Income Tax Progression and Redistributive Effect: The Influence of Changes in the Pre-Tax Income Distribution'," Public Finance/Finances Publiques, 49, 126-33.
______(1994b): "The Gini-Type Functions: An Alternative Derivation," Bulletin of Economic Research, 46, 81-90.
______(1995): "The Distributional Impact of Cash and In-Kind Transfers in Eastern Europe and Russia," in Public spending and the poor: Theory and evidence, ed. by D. van de Walle and K. Nead, Baltimore and London: Johns Hopkins University Press for the World Bank, 489-520.
______(1997): "A Simple Way to Calculate the Gini Coefficient, and Some Implications," Economics Letters, 56, 45-49.
______(2002): "True World Income Distribution, 1988 and 1993: First Calculation Based on Household Surveys Alone," Economic Journal, 112, 51-92.
MILANOVIC, B. AND S. YITZHAKI (2002): "Decomposing World Income Distribution: Does the World Have a Middle Class?" Review of Income and Wealth, 48, 155-78.
MILLS, J. AND S. ZANDVAKILI (1997): "Statistical Inference via Bootstrapping for Measures of Inequality," Journal of Applied Econometrics, 12, 133-50.
MITRAKOS, T. AND P. TSAKLOGLOU (1998): "Decomposing Inequality under Alternative Concepts of Resources: Greece 1988," Journal of Income Distribution, 8, 241-53.
MOFFITT, R. (1989): "Estimating the Value of an In-Kind Transfer: The Case of Food Stamps," Econometrica, 57, 385-409.
MOOKHERJEE, D. AND A. SHORROCKS (1982): "A Decomposition Analysis of the Trend in U.K. Income Inequality," Economic Journal, 92, 886-902.
MORDUCH, J. (1998): "Poverty, Economic Growth, and Average Exit Time," Economics Letters, 59, 385-90.
MORDUCH, J. AND T. SICULAR (2002): "Rethinking Inequality Decomposition, with Evidence from Rural China," Economic Journal, 112, 93-106.
MORRIS, C. AND I. PRESTON (1986): "Taxes, Benefits and the Distribution of Income 1968-83," Fiscal Studies, 7, 18-27.
MOSLER, K. AND P. MULIERE (1996): "Inequality Indices and the Starshaped Principle of Transfers," Statistical Papers, 37, 343-64.
MOYES, P. (1999): "Stochastic Dominance and the Lorenz Curve," in Handbook of income inequality measurement. With a foreword by Amartya Sen. Recent Economic Thought Series, ed. by J. Silber, Boston; Dordrecht and London: Kluwer Academic, 199-222.
MOYES, P. AND A. SHORROCKS (1998): "The Impossibility of a Progressive Tax Structure," Journal of Public Economics, 69, 49-65.
MULIERE, P. AND M. SCARSINI (1989): "A Note on Stochastic Dominance and Inequality Measures," Journal of Economic Theory, 49, 314-23.
MULLER, C. (2002): "Prices and Living Standards: Evidence for Rwanda," Journal of Development Economics, 68, 187-203.
MUSGRAVE, R. (1959): The Theory of Public Finance, New York: McGraw-Hill.
______(1990): "Horizontal Equity, Once More," National Tax Journal, 43, 113-122.
MUSGRAVE, R. AND T. THIN (1948): "Income Tax Progression 1929-48," The Journal of Political Economy, 56, 498-514.
MYLES, J. AND G. PICOT (2000): "Poverty Indices and Policy Analysis," Review of Income and Wealth, 46, 161-79.
NARAYAN, D. AND M. WALTON (2000): Voices of the poor: Crying out for change, New York and Oxford: Oxford University Press for the World Bank.
NELISSEN, J. (1998): "Annual versus Lifetime Income Redistribution by Social Security," Journal of Public Economics, 68, 223-49.
NEWBERY, D. (1995): "The Distributional Impact of Price Changes in Hungary and the United Kingdom," Economic Journal, 105, 847-63.
NICOL, C. (1994): "Identifiability of Household Equivalence Scales through Exact Aggregation: Some Empirical Results," Canadian Journal of Economics, 27, 307-28.
NOLAN, B. (1987): "Direct Taxation, Transfers and Reranking: Some Empirical Results for the UK," Oxford Bulletin of Economics and Statistics, 49, 273-290.
NOLAN, B. AND C. WHELAN (1996): "The Relationship between Income and Deprivation: A Dynamic Perspective," Revue Economique, 47, 709-17.
NORREGAARD, J. (1990): "Progressivity of Income Tax Systems," OECD Economic Studies, 15, 83-110.
NOZICK, R. (1974): Anarchy, State and Utopia, Oxford: Basil Blackwell.
OGWANG, T. (2000): "A Convenient Method of Computing the Gini Index and Its Standard Error," Oxford Bulletin of Economics and Statistics, 62, 123-29.
OGWANG, T. AND G. RAO (2000): "Hybrid Models of the Lorenz Curve," Economics Letters, 69, 39-44.
O'HIGGINS, M. AND P. RUGGLES (1981): "The Distribution of Public Expenditures and Taxes Among Households in the United Kingdom," The Review of Income and Wealth, 27, 298-326.
O'HIGGINS, M., G. SCHMAUS, AND G. STEPHENSON (1989): "Income Distribution and Redistribution: A Microdata Analysis for Seven Countries," Review of Income and Wealth, 35, 107-31.
OK, E. (1995): "Fuzzy Measurement of Income Inequality: A Class of Fuzzy Inequality Measures," Social Choice and Welfare, 12, 111-36.
______(1997): "On Opportunity Inequality Measurement," Journal of Economic Theory, 77, 300-329.
OKUN, A. (1975): "Equality and Efficiency: The Big Tradeoff," Tech. rep., Brookings Institution.
O'NEILL, D. AND O. SWEETMAN (2001): "Inequality in Ireland 1987-1994: A Comparison Using Measures of Income and Consumption," Journal of Income Distribution, 10, 23-39.
ORSHANSKY, M. (1965): "Counting the Poor: Another Look at the Poverty Profile," Social Security Bulletin, 28, 3-29.
______(1988): "Counting the poor: another look at the poverty profile," Social Security Bulletin, 51, 25-51.
OSBERG, L. (2000): "Poverty in Canada and the United States: measurements, trends, and implications," Canadian Journal of Economics, 33, 847-877.
OSBERG, L. AND K. XU (1999): "Poverty Intensity: How Well Do Canadian Provinces Compare?" Canadian Public Policy, 25, 179-95.
______(2000): "International comparisons of poverty intensity: Index decomposition and bootstrap inference," Journal of Human Resources, 35, 51-81.
OWEN, G. (1977): "Values of games with a priori unions," in Essays in Mathematical Economics and Game Theory, ed. by R. Heim and O. Moeschlin, New York: Springer Verlag.
PALMITESTA, P., C. PROVASI, AND C. SPERA (2000): "Confidence Interval Estimation for Inequality Indices of the Gini Family," Computational Economics, 16, 137-47.
PARK, A., S. WANG, AND G. WU (2002): "Regional Poverty Targeting in China," Journal of Public Economics, 86, 123-53.
PARKER, S. (1999): "The Inequality of Employment and Self-Employment Incomes: A Decomposition Analysis for the U.K.," Review of Income and Wealth, 45, 263-74.
PARKER, S. AND F. SIDDIQ (1997): "Seeking a Comprehensive Measure of Economic Well-Being: Annuitisation versus Capitalisation," Economics Letters, 54, 241-44.
PATTANAIK, P. AND M. SENGUPTA (1995): "On Alternative Axiomatization of Sen's Poverty Measure," Review of Income and Wealth, 41, 73-80.
PAUL, S. (1991): "An Index of Relative Deprivation," Economics Letters, 36, 337-41.
PECHMAN, J. (1985): Who Paid the Taxes 1966-1985?, Brookings Institution, Washington.
PEN, J. (1971): Income Distribution: facts, theories, policies, New York: Praeger.
PENDAKUR, K. (1999): "Semiparametric Estimates and Tests of Base-Independent Equivalence Scales," Journal of Econometrics, 88, 1-40.
______(2001): "Consumption Poverty in Canada, 1969 to 1998," Canadian Public Policy, 27, 125-49.
______(2002): "Taking Prices Seriously in the Measurement of Inequality," Journal of Public Economics, 86, 47-69.
PERSSON, M. AND P. WISSEN (1984): "Redistributional Aspects of Tax Evasion," The Scandinavian Journal of Economics, 86, 131-49.
PFAHLER, W. (1983): "Measuring Redistributional Effects of Tax Progressivity by Lorenz Curves," Jahrbücher für Nationalökonomie und Statistik, 198, 237-49.
______(1987): "Redistributive Effects of Tax Progressivity: Evaluating a General Class of Aggregate Measures," Public Finance/Finances Publiques, 42, 1-31.
PHIPPS, S. (1993): "Measuring Poverty among Canadian Households: Sensitivity to Choice of Measure and Scale," Journal of Human Resources, 28, 162-84.
______(1998): "What Is the Income "Cost of a Child"? Exact Equivalence Scales for Canadian Two-Parent Families," Review of Economics and Statistics, 80, 157-64.
PHIPPS, S. AND P. BURTON (1995): "Sharing within Families: Implications for the Measurement of Poverty among Individuals in Canada," Canadian Journal of Economics, 28, 177-204.
PHIPPS, S. AND T. GARNER (1994): "Are Equivalence Scales the Same for the United States and Canada?" Review of Income and Wealth, 40, 1-17.
PLOTNICK, R. (1981): "A Measure of Horizontal Inequity," The Review of Economics and Statistics, LXII, 283-288.
______(1982): "The Concept and Measurement of Horizontal Inequity," Journal of Public Economics, 17, 373-391.
______(1985): "A Comparison of Measures of Horizontal Inequity," ed. by Martin David and Timothy Smeeding.
______(1999): "Comments on 'Horizontal inequity measurement: a basic reassessment'," in Handbook of Income Inequality Measurement, ed. by J. Silber, Kluwer Academic Publishers, 554-556.
PODDER, N. (1993): "The Disaggregation of the Gini Coefficient by Factor Components and Its Applications to Australia," Review of Income and Wealth, 39, 51-61.
______(1996): "Relative Deprivation, Envy and Economic Inequality," Kyklos, 49, 353-76.
PODDER, N. AND S. CHATTERJEE (2002): "Sharing the National Cake in Post Reform New Zealand: Income Inequality Trends in Terms of Income Sources," Journal of Public Economics, 86, 1-27.
PODDER, N. AND P. MUKHOPADHAYA (2001): "The Changing Pattern of Sources of Income and Its Impact on Inequality: The Method and Its Application to Australia, 1975-94," Economic Record, 77, 242-51.
POLLAK, R. (1991): "Welfare Comparisons and Situation Comparisons," Journal of Econometrics, 50, 31-48.
PRADHAN, M. AND M. RAVALLION (2000): "Measuring Poverty Using Qualitative Perceptions of Consumption Adequacy," Review of Economics and Statistics, 82, 462-71.
PRATT, J. (1964): "Risk Aversion in the Small and in the Large," Econometrica, 32, 122-36.
PRESTON, I. (1995): "Sampling Distributions of Relative Poverty Statistics," Applied Statistics, 44, 91-99.
PRICE, D. AND S. NOVAK (1999): "The Tax Incidence of Three Texas Lottery Games: Regressivity, Race, and Education," National Tax Journal, 52, 741-51.
PROPPER, C. (1990): "Contingent Valuation of Time Spent on NHS Waiting Lists," The Economic Journal, 100, 193-99.
QUISUMBING, A., L. HADDAD, AND C. PENA (2001): "Are Women Overrepresented among the Poor? An Analysis of Poverty in 10 Developing Countries," Journal of Development Economics, 66, 225-69.
RADNER, D. (1997): "Noncash Income, Equivalence Scales, and the Measurement of Economic Well-Being," Review of Income and Wealth, 43, 71-88.
RADY, T. (2000): "Poverty and Inequality in Egypt Between 1974/75 and 1995/96: The Sensitivity of the Poverty Levels to the Choice of the Concept of Poverty, the Poverty Measure, the Unit of Analysis, and the Poverty Line," Ph.D. thesis, Northern Illinois University.
RAO, R. (1973): Linear Statistical Inference and Its Applications, New York: John Wiley and Sons Inc.
RAO, V. (2000): "Price Heterogeneity and "Real" Inequality: A Case Study of Prices and Poverty in Rural South India," Review of Income and Wealth, 46, 201-11.
RAVALLION, M. (1992): "Does Undernutrition Respond to Income and Prices? Dominance Tests for Indonesia," World Bank Economic Review, 6, 109-24.
______(1994): Poverty Comparisons, Fundamentals of Pure and Applied Economics, Switzerland: Harwood Academic Publishers.
______(1996): "Issues in Measuring and Modelling Poverty," The Economic Journal, 106, 1328-1343.
______(1998a): "Does Aggregation Hide the Harmful Effects of Inequality on Growth?" Economics Letters, 61, 73-77.
______(1998b): "Poverty Lines in Theory and Practice," LSMS Working Paper 133, The World Bank.
______(1999): "Are Poorer States Worse at Targeting Their Poor?" Economics Letters, 65, 373-77.
______(2001): "Growth, Inequality and Poverty: Looking Beyond Averages," World Development, 29, 1803-15.
______(2002): "Are the Poor Protected from Budget Cuts? Evidence for Argentina," Journal of Applied Economics, 5, 95-121.
RAVALLION, M. AND B. BIDANI (1994): "How Robust Is a Poverty Profile?" World Bank Economic Review, 8, 75-102.
RAVALLION, M. AND S. CHEN (1997): "What Can New Survey Data Tell Us about Recent Changes in Distribution and Poverty?" World Bank Economic Review, 11, 357-82.
______(2003): "Measuring Pro-poor Growth," Economics Letters, 78, 93-99.
RAVALLION, M. AND G. DATT (2002): "Why Has Economic Growth Been More Pro-poor in Some States of India Than Others?" Journal of Development Economics, 68, 381-400.
RAVALLION, M. AND M. LOKSHIN (2002): "Self-Rated Economic Welfare in Russia," European Economic Review, 46, 1453-73.
RAVALLION, M., D. VAN DE WALLE, AND M. GAUTAM (1995): "Testing a Social Safety Net," Journal of Public Economics, 57, 175-99.
RAWLS, J. (1971): A Theory of Justice, Cambridge, MA: Harvard University Press.
______(1974): "Some Reasons for the Maximin Criterion," American Economic Review.
REED, D. AND M. CANCIAN (2001): "Sources of Inequality: Measuring the Contributions of Income Sources to Rising Family Income Inequality," Review of Income and Wealth, 47, 321–33.
RENWICK, T. AND B. BERGMANN (1993): "A Budget-Based Definition of Poverty: With an Application to Single-Parent Families," Journal of Human Resources, 28, 1–24.
REYNOLDS, M. AND E. SMOLENSKY (1977): Public Expenditure, Taxes and the Distribution of Income: The United States, 1950, 1961, 1970, New York: Academic Press.
RODGERS, G., C. GORE, AND J. FIGUEIREDO, eds. (1995): Social Exclusion: Rhetoric, Reality, Responses, Geneva: International Labour Organization, International Institute for Labour Studies.
RODGERS, J. AND J. RODGERS (2000): "Poverty Intensity in Australia," Australian Economic Review, 33, 235–44.
RONGVE, I. (1997): "Statistical Inference for Poverty Indices with Fixed Poverty Lines," Applied Economics, 29, 387–92.
RONGVE, I. AND C. BEACH (1997): "Estimation and Inference for Normative Inequality Indices," International Economic Review, 38, 83–96.
ROSEN, H. (1978): "An Approach to the Study of Income, Utility, and Horizontal Equity," Quarterly Journal of Economics, 92, 307–322.
ROTHSCHILD, M. AND J. STIGLITZ (1973): "Some Further Results on the Measurement of Inequality," Journal of Economic Theory, 6, 188–204.
ROWNTREE, S. (1901): Poverty, a study of town life, London: MacMillan.
RUGGERI, G., D. VAN WART, AND R. HOWARD (1994): "The Redistributional Impact of Government Spending in Canada," Public Finance, 49, 212–43.
RUGGLES, P. AND M. O'HIGGINS (1981): "The Distribution of Public Expenditure among Households in the United States," The Review of Income and Wealth, 27.
RUIZ-CASTILLO, J. (1998): "A Simplified Model for Social Welfare Analysis: An Application to Spain, 1973–74 to 1980–81," Review of Income and Wealth, 44, 123–41.
RUNCIMAN, W. (1966): Relative Deprivation and Social Justice: A Study of Attitudes to Social Inequality in Twentieth-Century England, Berkeley and Los Angeles: University of California Press.
RYU, H. AND D. SLOTTJE (1999): "Parametric Approximations of the Lorenz Curve," in Handbook of income inequality measurement. With a foreword by Amartya Sen. Recent Economic Thought Series, ed. by J. Silber, Boston; Dordrecht and London: Kluwer Academic, 291–312.
SA-AADU, J., J. SHILLING, AND C. SIRMANS (1991): "Horizontal and Vertical Inequities in the Capital Gains Taxation of Owner-Occupied Housing," Public Finance Quarterly, 19, 477–85.
SAHN, D. AND D. STIFEL (2000): "Poverty Comparisons Over Time and Across Countries in Africa," World Development, 28, 2123–55.
______(2002): "Robust Comparisons of Malnutrition in Developing Countries," American Journal of Agricultural Economics, 84, 716–35.
SAHN, D., S. YOUNGER, AND K. SIMLER (2000): "Dominance Testing of Transfers in Romania," Review of Income and Wealth, 46, 309–27.
SALAS, R. (1998): "Welfare-Consistent Inequality Indices in Changing Populations: The Marginal Population Replication Axiom: A Note," Journal of Public Economics, 67, 145–50.
SARABIA, J. M., E. CASTILLO, AND D. SLOTTJE (1999): "An Ordered Family of Lorenz Curves," Journal of Econometrics, 91, 43–60.
______(2001): "An Exponential Family of Lorenz Curves," Southern Economic Journal, 67, 748–56.
SASTRY, D. AND U. KELKAR (1994): "Note on the Decomposition of Gini Inequality," Review of Economics and Statistics, 76, 584–86.
SAUNDERS, P. (1994): Welfare and Inequality: National and International Perspectives on the Australian Welfare State, Cambridge University Press.
SCHADY, N. (2002): "Picking the Poor: Indicators for Geographic Targeting in Peru," Review of Income and Wealth, 48, 417–33.
SCHLUTER, C. AND M. TREDE (2002a): "Statistical Inference for Inequality and Poverty Measurement with Dependent Data," International Economic Review, 43, 493–508.
______(2002b): "Tails of Lorenz Curves," Journal of Econometrics, 109, 151–66.
SCHULTZ, P. (1998): "Inequality in the Distribution of Personal Income in the World: How It Is Changing and Why," Journal of Population Economics, 11, 307–44.
SCHWARZ, H. AND B. GUSTAFSSON (1991): "Income Redistribution Effects of Tax Reforms in Sweden," Journal of Policy Modeling, 13, 551–570.
SCHWARZE, J. (1996): "How Income Inequality Changed in Germany following Reunification: An Empirical Analysis Using Decomposable Inequality Measures," Review of Income and Wealth, 42, 1–11.
SEFTON, T. (2002): "Targeting Fuel Poverty in England: Is the Government Getting Warm?" Fiscal Studies, 23, 369–99.
SEN, A. (1973): On Economic Inequality, Oxford: Clarendon Press.
______(1976): "Poverty: An Ordinal Approach to Measurement," Econometrica, 44, 219–231.
______(1981): Poverty and Famines: An Essay on Entitlement and Deprivation, Clarendon Press, Oxford University Press.
______(1982): "Equality of What?" in Choice, Welfare and Measurement, Cambridge, Mass.: MIT Press, chap. 16.
______(1983): "Poor, Relatively Speaking," Oxford Economic Papers, 35, 153–169.
______(1985): Commodities and Capabilities, Amsterdam: North-Holland.
______(1992): Inequality Reexamined, New York, Cambridge: Harvard University Press.
SHAPLEY, L. (1953): "A value for n-person games," in Contributions to the Theory of Games, ed. by H. W. Kuhn and A. W. Tucker, Princeton: Princeton University Press, vol. 2 of Annals of Mathematics Studies, 303–317.
SHORROCKS, A. (1980): "The Class of Additively Decomposable Inequality Measures," Econometrica, 48, 613–625.
______(1982): "Inequality Decomposition by Factor Components," Econometrica, 50, 193–211.
______(1983): "Ranking Income Distributions," Economica, 50, 3–17.
______(1984): "Inequality Decomposition by Population Subgroups," Econometrica, 52, 1369–1385.
______(1987): "Transfer Sensitive Inequality Measures," Review of Economic Studies, LIV, 485–497.
______(1995): "Revisiting the Sen Poverty Index," Econometrica, 63, 1225–30.
______(1998): "Deprivation Profiles and Deprivation Indices," in The distribution of welfare and household production: International perspectives, ed. by S. P. Jenkins, A. Kapteyn, and B. M. S. van Praag, Cambridge; New York and Melbourne: Cambridge University Press, 250–67.
______(1999): "Decomposition procedures for distributional analysis: A unified framework based on the Shapley value," Tech. rep., University of Essex.
SILBER, J. (1989): "Factor Components, Population Subgroups and the Computation of the Gini Index of Inequality," The Review of Economics and Statistics, 71, 107–115.
______(1993): "Inequality Decomposition by Income Source: A Note," Review of Economics and Statistics, 75, 545–47.
SILVER, H. (1994): "Social Exclusion and Social Solidarity: Three Paradigms," International Labour Review, 133, 531–576.
SILVERMAN, B. (1986): Density Estimation for Statistics and Data Analysis, London: Chapman and Hall.
SKOUFIAS, E. (2001): "Changes in Regional Inequality and Social Welfare in Indonesia from 1996 to 1999," Journal of International Development, 13, 73–91.
SLEMROD, J. (1990): "Optimal Taxation and Optimal Tax Systems," Journal of Economic Perspectives, 4, 157–178.
SLESNICK, D. (1993): "Gaining Ground: Poverty in the Postwar United States," Journal of Political Economy, 101, 1–38.
______(1996): "Consumption and Poverty: How Effective Are In-Kind Transfers?" Economic Journal, 106, 1527–45.
______(2002): "Prices and Regional Variation in Welfare," Journal of Urban Economics, 51, 446–68.
SLITOR, R. (1948): "The Measurement of Progressivity and Built-in Flexibility," Quarterly Journal of Economics, 62, 309–13.
SMEEDING, T. AND J. CODER (1995): "Income Inequality in Rich Countries during the 1980s," Journal of Income Distribution, 5, 13–29.
SMEEDING, T., L. RAINWATER, AND M. O'HIGGINS, eds. (1990): Poverty, Inequality, and the Distribution of Income in an International Context: Initial Research from the Luxembourg Income Study (LIS), London: Wheatsheaf Books.
SMEEDING, T., P. SAUNDERS, J. CODER, S. JENKINS, J. FRITZELL, A. HAGENAARS, R. HAUSER, AND M. WOLFSON (1993): "Poverty, Inequality, and Family Living Standards Impacts across Seven Nations: The Effect of Noncash Subsidies for Health, Education and Housing," Review of Income and Wealth, 39, 229–56.
SMEEDING, T. AND D. WEINBERG (2001): "Toward a Uniform Definition of Household Income," Review of Income and Wealth, 47, 1–24.
SMITH, A. (1776): An inquiry into the nature and causes of the wealth of nations, London: Home University Library.
SON, H. (2004): "A note on pro-poor growth," Economics Letters, 82, 307–314.
SOTOMAYOR, O. (1996): "Poverty and Income Inequality in Puerto Rico, 1969–89: Trends and Sources," Review of Income and Wealth, 42, 49–61.
SPENCER, B. AND S. FISHER (1992): "On Comparing Distributions of Poverty Gaps," The Indian Journal of Statistics, 54, 114–26.
STANOVNIK, T. (1992): "Perception of Poverty and Income Satisfaction: An Empirical Analysis of Slovene Households," Journal of Economic Psychology, 13, 57–69.
STERN, N. (1982): "Optimum Taxation with Errors in Administration," Journal of Public Economics, 17, 181–211.
______(1984): "Optimum Taxation and Tax Policy," IMF Staff Papers, 339–378.
STIGLITZ, J. (1982): "Utilitarianism and Horizontal Equity: The Case for Random Taxation," Journal of Public Economics, 18, 1–33.
STODDER, J. (1991): "Equity-Efficiency Preferences in Poland and the Soviet Union: Order-Reversals under the Atkinson Index," Review of Income and Wealth, 37, 287–99.
STRANAHAN, H. AND M. O. BORG (1998): "Horizontal Equity Implications of the Lottery Tax," National Tax Journal, 51, 71–82.
STREETEN, P., S. BURKI, M. UL HAQ, N. HICKS, AND F. STEWART (1981): First Things First. Meeting Basic Human Needs in the Developing World, New York and Oxford: World Bank and Oxford University Press.
SUBRAMANIAN, S. (2002): "An Elementary Interpretation of the Gini Inequality Index," Theory and Decision, 52, 375–79.
SUITS, D. (1977): "Measurement of Tax Progressivity," American Economic Review, 67, 747–52.
SUTHERLAND, H. (1996): "Households, Individuals and the Redistribution of Income," Tech. Rep. University of Cambridge, Department of Applied Economics Working Paper, Amalgamated Series: 9614.
SZULC, A. (1995): "Measurement of Poverty: Poland in the 1980's," The Review of Income and Wealth, 41, 191–206.
TAKAYAMA, N. (1979): "Poverty, Income Inequality, and their Measures: Professor Sen's Axiomatic Approach Reconsidered," Econometrica, 47, 747–59.
TAM, M. Y. AND R. ZHANG (1996): "Ranking Income Distributions: The Trade-off between Efficiency and Equality," Economica, 63, 239–52.
THISTLE, P. (1988): "Uniform Progressivity, Residual Progression, and Single-Crossing," Journal of Public Economics, 37, 121–26.
______(1990): "Large Sample Properties of two Inequality Indices," Econometrica, 58, 725–28.
THON, D. (1979): "On Measuring Poverty," Review of Income and Wealth, 25, 429–439.
THORBECKE, E. AND D. BERRIAN (1992): "Budgetary Rules to Minimize Societal Poverty in a General Equilibrium Context," Journal of Development Economics, 39, 189–205.
TOWNSEND, P. (1979): Poverty in the United Kingdom: A Survey of Household Resources and Standards of Living, Berkeley: University of California Press.
TSUI, K. Y. (1996): "Growth-Equity Decomposition of a Change in Poverty: An Axiomatic Approach," Economics Letters, 50, 417–23.
______(1998): "Trends and Inequalities of Rural Welfare in China: Evidence from Rural Households in Guangdong and Sichuan," Journal of Comparative Economics, 26, 783–804.
TUOMALA, M. (1990): Optimal Income Tax and Redistribution, Oxford: Clarendon Press.
VAN DE GAER, D., N. FUNNELL, AND T. MCCARTHY (1999): "Statistical Inference for Two Measures of Inequality When Incomes Are Correlated," Economics Letters, 64, 295–300.
VAN DE VEN, J., J. CREEDY, AND P. LAMBERT (2001): "Close Equals and Calculation of the Vertical, Horizontal and Reranking Effects of Taxation," Oxford Bulletin of Economics and Statistics, 63, 381–94.
VAN DE WALLE, D. (1998a): "Assessing the Welfare Impacts of Public Spending," World Development, 26, 365–79.
______(1998b): "Targeting Revisited," World Bank Research Observer, 13, 231–48.
VAN DE WALLE, D. AND K. NEAD (1995): Public spending and the poor: Theory and evidence, Baltimore and London: Johns Hopkins University Press for the World Bank.
VAN DEN BOSCH, K. (1998): "Poverty and Assets in Belgium," Review of Income and Wealth, 44, 215–28.
VAN DEN BOSCH, K., T. CALLAN, P. ESTIVILL, P. HAUSMAN, B. JEANDIDIER, R. MUFFELS, AND J. YFANTOPOULOS (1993): "A Comparison of Poverty in Seven European Countries and Regions Using Subjective and Relative Measures," Journal of Population Economics, 6, 235–259.
VAN DOORSLAER, E., A. WAGSTAFF, H. VAN DER BURG, T. CHRISTIANSEN, G. CITONI, R. DI BIASE, U.G. GERDTHAM, M. GERFIN, L. GROSS, AND U. HAKINNEN (1999): "The Redistributive Effect of Health Care Finance in Twelve OECD Countries," Journal of Health Economics, 18, 291–313.
VERMAETEN, A., I. GILLESPIE, AND F. VERMAETEN (1995): "Who Paid the Taxes in Canada, 1951–1988?" Canadian Public Policy, 21, 317–43.
VIARD, A. (2001): "Optimal Categorical Transfer Payments: The Welfare Economics of Limited Lump-Sum Redistribution," Journal of Public Economic Theory, 3, 483–500.
VICKREY, W. (1972): Agenda for Progressive Taxation, New York, NY: Ronald Press, 1st ed.
WAGSTAFF, A. AND E. VAN DOORSLAER (1997): "Progressivity, horizontal equity and reranking in health care finance: a decomposition analysis for the Netherlands," Journal of Health Economics, 16, 499–516.
______(2001): "What Makes the Personal Income Tax Progressive? A Comparative Analysis of Fifteen OECD Countries," International Tax and Public Finance, 8, 299–315.
WAGSTAFF, A., E. VAN DOORSLAER, V. HATTEM, S. CALONGE, T. CHRISTIANSEN, G. CITONI, U. GERDTHAM, M. GERFIN, L. GROSS, AND U. HAKINNEN (1999): "Redistributive Effect, Progressivity and Differential Tax Treatment: Personal Income Taxes in Twelve OECD Countries," Journal of Public Economics, 72, 73–98.
WANE, W. (2001): "The Optimal Income Tax When Poverty Is a Public 'Bad'," Journal of Public Economics, 82, 271–99.
WANG, Q., G. SHI, AND Y. ZHENG (2002): "Changes in Income Inequality and Welfare under Economic Transition: Evidence from Urban China," Applied Economics Letters, 9, 989–91.
WANG, Y. Q. AND K. Y. TSUI (2000): "A New Class of Deprivation-Based Generalized Gini Indices," Economic Theory, 16, 363–77.
WATTS, H. W. (1968): "An Economic Definition of Poverty," in Understanding Poverty, ed. by D. Moynihan, New York: Basic Books.
WEYMARK, J. (1981): "Generalized Gini Inequality Indices," Mathematical Social Sciences, 1, 409–30.
WHITMORE, G. (1970): "Third-Degree Stochastic Dominance," The American Economic Review, 60, 457–59.
WILDASIN, D. (1984): "On Public Good Provision With Distortionary Taxation," Economic Inquiry, 22, 227–243.
WODON, Q. (1997a): "Food Energy Intake and Cost of Basic Needs: Measuring Poverty in Bangladesh," Journal of Development Studies, 34, 66–101.
______(1997b): "Targeting the Poor Using ROC Curves," World Development, 25, 2083–92.
______(1999): "Between Group Inequality and Targeted Transfers," Review of Income and Wealth, 45, 21–39.
WODON, Q. AND S. YITZHAKI (2002): "Evaluating the Impact of Government Programs on Social Welfare: The Role of Targeting and the Allocation Rules among Program Beneficiaries," Public Finance Review, 30, 102–23.
WOLAK, F. (1989): "Testing Inequality Constraints in Linear Econometric Models," Journal of Econometrics, 41, 205–35.
WOLFF, E. (1998): "Recent Trends in the Size Distribution of Household Wealth," Journal of Economic Perspectives, 12, 131–50.
WOOLLEY, F. AND J. MARSHALL (1994): "Measuring Inequality within the Household," Review of Income and Wealth, 40, 415–31.
WORLD BANK (1990): World Development Report 1990: Poverty, Washington: World Bank.
XU, K. (1997): "Asymptotically Distribution-Free Statistical Test for Generalized Lorenz Curves: An Alternative Approach," Journal of Income Distribution, 7, 45–62.
______(1998): "Statistical Inference for the Sen-Shorrocks-Thon Index of Poverty Intensity," Journal of Income Distribution, 8, 143–52.
______(2000): "Inference for Generalized Gini Indices Using the Iterated-Bootstrap Method," Journal of Business and Economic Statistics, 18, 223–27.
XU, K. AND L. OSBERG (1998): A Distribution-Free Test for Deprivation Dominance, Halifax: Department of Economics, Dalhousie University.
YAARI, M. (1987): "The Dual Theory of Choice under Risk," Econometrica, 55, 95–115.
______(1988): "A Controversial Proposal Concerning Inequality Measurement," Journal of Economic Theory, 44, 381–97.
YAO, S. (1997): "Decomposition of Gini Coefficients by Income Factors: A New Approach and Application," Applied Economics Letters, 4, 27–31.
YATES, J. (1994): "Imputed Rent and Income Distribution," Review of Income and Wealth, 40, 43–66.
YITZHAKI, S. (1979): "Relative Deprivation and the Gini Coefficient," Quarterly Journal of Economics, 93, 321–324.
______(1982a): "Relative Deprivation and Economic Welfare," European Economic Review, 17, 99–113.
______(1982b): "Stochastic Dominance, Mean Variance, and Gini Mean Difference," American Economic Review, 72, 178–85.
______(1983): "On an Extension of the Gini Index," International Economic Review, 24, 617–628.
______(1991): "Calculating Jackknife Variance Estimators for Parameters of the Gini Method," Journal of Business and Economic Statistics, 9, 235–39.
______(1997): "The Effect of Marginal Changes in Prices on Income Inequality in Romania," in Inequality and Taxation, Research on Economic Inequality, vol. 7, ed. by S. Zandvakili, Greenwich, Conn. and London: JAI Press, 241–58.
______(1998): "More Than a Dozen Alternative Ways of Spelling Gini," in Research on Economic Inequality, vol. 8, ed. by D. J. Slottje, Stamford, Conn. and London: JAI Press, 13–30.
YITZHAKI, S. AND R. LERMAN (1991): "Income Stratification and Income Inequality," Review of Income and Wealth, 37, 313–29.
YITZHAKI, S. AND J. LEWIS (1996): "Guidelines on Searching for a Dalton-Improving Tax Reform: An Illustration with Data from Indonesia," The World Bank Economic Review, 10, 541–562.
YITZHAKI, S. AND J. SLEMROD (1991): "Welfare Dominance: An Application to Commodity Taxation," American Economic Review, LXXXI, 480–96.
YITZHAKI, S. AND W. THIRSK (1990): "Welfare Dominance and the Design of Excise Taxation in the Côte d'Ivoire," Journal of Development Economics, 33, 1–18.
YOUNGER, S., D. SAHN, S. HAGGBLADE, AND P. DOROSH (1999): "Tax Incidence in Madagascar: An Analysis Using Household Data," World Bank Economic Review, 13, 303–31.
ZAIDI, A. AND K. DE VOS (2001): "Trends in Consumption-Based Poverty and Inequality in the European Union during the 1980s," Journal of Population Economics, 14, 367–90.
ZANDVAKILI, S. (1994): "Income Distribution and Redistribution through Taxation: An International Comparison," Empirical Economics, 19, 473–91.
______(1995): "Decomposable Measures of Income Tax Progressivity," Applied Economics, 27, 657–60.
______(1999): "Income Inequality among Female Heads of Households: Racial Inequality Reconsidered," Economica, 66, 119–33.
ZANDVAKILI, S. AND J. MILLS (2001): "The Distributional Implications of Tax and Transfer Programs in US," Quarterly Review of Economics and Finance, 41, 167–81.
ZHENG, B. (1993): "An Axiomatic Characterization of the Watts Poverty Index," Economics Letters, 42, 81–86.
______(1994): "Can a Poverty Index Be Both Relative and Absolute?" Econometrica, 62, 1453–58.
______(1997): "Aggregate Poverty Measures," Journal of Economic Surveys, 11, 123–62.
______(1999a): "On the Power of Poverty Orderings," Social Choice and Welfare, 16, 349–71.
______(1999b): "Statistical Inferences for Testing Marginal Rank and (Generalized) Lorenz Dominances," Southern Economic Journal, 65, 557–70.
______(2000a): "Minimum Distribution-Sensitivity, Poverty Aversion, and Poverty Orderings," Journal of Economic Theory, 95, 116–37.
______(2000b): "Poverty Orderings," Journal of Economic Surveys, 14, 427–66.
______(2001a): "Poverty Orderings: A Graphical Illustration," Social Choice and Welfare, 18, 165–78.
______(2001b): "Statistical Inference for Poverty Measures with Relative Poverty Lines," Journal of Econometrics, 101, 337–56.
______(2002): "Testing Lorenz Curves with Nonsimple Random Samples," Econometrica, 70, 1235–43.
ZHENG, B. AND B. CUSHING (2001): "Statistical Inference for Testing Inequality Indices with Dependent Samples," Journal of Econometrics, 101, 315–35.
ZHENG, B., B. CUSHING, AND V. CHOW (1995): "Statistical Tests of Changes in U.S. Poverty, 1975 to 1990," Southern Economic Journal, 62, 334–47.
ZHENG, B., J. P. FORMBY, J. SMITH, AND V. CHOW (2000): "Inequality Orderings, Normalized Stochastic Dominance, and Statistical Inference," Journal of Business and Economic Statistics, 18, 479–88.
ZOLI, C. (1999): "Intersecting Generalized Lorenz Curves and the Gini Index," Social Choice and Welfare, 16, 183–96.
1 – f: FPC factor, 282
A: hypothetical distribution, 52, 92, 155
AD(z): absolute deprivation, 90
B: benefit, 136
B: hypothetical distribution, 52, 92, 156
Bi: level of gross benefit expended on individual i, 226
: benefit offered to i, 226
C: number of goods, 209
C: number of income components, 97, 212, 218
CB(p): concentration curve of benefit B at rank p, ordered in terms of X, 189
CF(ε): cost of inequality subsequent to a flat tax, 150
CN(ε): cost of inequality in the distribution of net income, 150
CN(p): concentration curve for N at rank p, ordered in terms of X, 129
CN(p): concentration curve for N at rank p, ordered in terms of X, 187
CT(p): concentration curve for taxes T at rank p, ordered in terms of X, 128
Ci: Shapley value of factor i, 71
C: cost of inequality in the distribution of locally pexpected utilities of net incomes, 150
(z): stochastic dominance curve of order s at z and for distribution A, 167, 176
E: equivalence scale, 31
E: household size, 31
F(y): distribution function, 52, 208
F(z): proportion of individuals underneath the poverty line z, 39, 156, 167, 206
FNx(·): distribution function of N conditional on X being equal to x, 128
FX,N(·,·): joint distribution function of gross and net incomes, 128
G(p): proportional change in the Generalized Lorenz curve, 193
G(p; z): Cumulative Poverty Gap, 89, 175, 192
GCxc(p): generalized concentration curve for xc, 189, 209
GL(p): Generalized Lorenz curve, 65, 183, 193
I: index of inequality corresponding to the social welfare function W, 61
I(2): standard Gini index, 55
I(ρ): S-Gini inequality index, 55–57, 131, 149
I(ρ, ε): Atkinson-Gini inequality index, 61
I(θ): Generalized entropy inequality index, 67
I(k; θ): inequality within subgroup k, 68
ICX(c) (ρ): concentration index for X(c), 130, 220
IT(ρ): S-Gini indices of TR-progressivity, 144
IV(ρ): S-Gini indices of IR-progressivity and vertical equity, 144
I[x]: indicator function, 226
I*(z; ρ, ε): index of inequality in censored income, 89
IN(ε): Atkinson inequality index for N, 64, 149
J: number of taxes and benefits, 146
K: number of mutually exclusive population subgroups, 87
K(u): the multivariate Gaussian Kernel, 265
LA(p): Lorenz curve for A, 49, 50, 52, 129, 131, 187, 219
M: permutation matrix, 160
M: the population number of last sampling units (LSU), 285
MA: number of adults in household, 31
Mhi: the population number of last sampling units (LSU), 284
N: net income, 127
N(p): p-quantile of net income, 148, 188
N(qp): q-quantile of conditional distribution of N, 128, 147, 148
Nh: the population number of primary sampling units (PSU) in a stratum h, 284
Nj: observation j of N, 144
P(k; z; α): FGT poverty index of subgroup k, 87, 200
P(z; α): Foster-Greer-Thorbecke (FGT) poverty index, 201
P(z; α = 1): average poverty gap, 169
P(z; ε): Clark, Hemming and Ulph poverty index, 82
P(z; ρ): S-Gini poverty index, 82
P(z; ρ, ε): Atkinson-Gini poverty index, 81
PC(z; ε): Chakravarty poverty index, 82
PW(z): Watts poverty index, 82
Q(k; p): p-quantile in a group k, 201
Q(p): p-quantile, 40–42, 50, 52, 128, 135, 165, 170
Q*(p; z): p-quantile censored at z, 42, 81
QN(qp): q-quantile function for net incomes conditional on a p-quantile of X, 128
R(q): per capita net commodity tax revenues, 209
RD(z, ρ): relative deprivation in censored income, 90
S: set of players, 71
Shijk: the size of statistical unit hijk in the population, 285
T: taxes net of transfers, 127
T(X): deterministic portion of tax T at X, 127
T(j): component j of total tax T, 146
U: utility function, 110
U(y; ε): utility function of y with parameter ε, 60, 61, 63–65, 148, 182
V(y, q, v): indirect utility function, 23
V(y, q; v): indirect utility function, 207
W: social welfare function, 60, 159, 182
W(ε): Atkinson social welfare function, 63, 64, 149
W(ρ): S-Gini social welfare function, 65
W(ρ, ε): Atkinson-Gini social welfare function, 63
W(ε): Atkinson social welfare function with locally pexpected net incomes, 148
W(ε): Atkinson social welfare function with locally pexpected utilities of net incomes, 148
X: gross income, 127, 188, 212
X(p): p-quantile of gross income, 147, 188
Xj: observation j of X, 144
X(c): income component c of total income X, 212, 218
X(c): type c of total expenditure X, 130
Xhijk: the value of the variable of interest for statistical unit hijk, 285
Y: food budget, 109
Y: population total of the x's, 274
Y: the population total of interest, 285
Δ P(z): difference in poverty indices, 167
Δ f(y): difference in the densities of income, 167
Γ (z): average poverty gap, 173
Ωs: class of s-order social welfare indices, 182, 211
Πs(z): class of s-order poverty indices, 165, 191, 211
θ: set of possible taste parameters, 208
s(l): class of s-order inequality indices, 184, 192
: a parameter space, 293
g(z): cost of inequality in poverty gaps, 88
g(z; α): cost of inequality in poverty for FGT indices, 89
α: level of statistical significance, 292
α: parameter of inequality aversion in measuring poverty, 88
α: vector of α k, 290
α = s − 1: ethical parameter, 184
αk: population mean of yk, 290
: subsets of a set S of players, 71
: class of dual s-order poverty indices, 174
ε: parameter of relative inequality aversion, 57
η(k): equal amount to each member of a group k, 200
η: some positive value, 118, 159, 161, 162
γ: efficiency cost of taxing j relative to that of taxing l, 209
: estimator of the population number of LSU, 285
: estimator of population size N, 274
: estimator of Y, 274
: estimator of the population total of interest, 285
: generic estimator, 290
: estimator of μ, 278
: sample estimate of , 292
: generic estimator, 290
: confidence interval, 293
: estimator of the sampling variance under simple random sampling, 286
κ(p): weight used in linear indices, 53
κ(p; ρ): weight used in S-Gini indices, 53, 56
λ(k): proportional factor for group k, 203
λ: Lagrange multiplier, 227
λ: proportional factor, 56, 59, 62, 96, 217
: social opportunity cost of spending public funds, 227
λc: proportional factor for component c, 212, 218
y(k): vector of incomes in group k, 70
CDc(z;α): consumption dominance curve of a component c, 208
IC(ρ): S-Gini indices of concentration, 130
IR(ρ): S-Gini indices of redistribution, 144
LP(X): Liability Progression at X, 133
MV: marginal value of a player i to a coalition , 71
NBi: net benefit of state support i, 226
RP(X): Residual Progression at X, 133
RR(ρ): S-Gini indices of reranking, 144
TR: tax or transfer, 187
deff: design effect, 286
x0: reference level, 29
E[ti]: the expected number of times unit i will appear in the sample, 274
μ(k): mean income in group k, 204
μ(k): mean income in subgroup k, 68
μ*(z): mean of censored quantiles, 42
μP(z): average income of the poor, 216
μg(z): average poverty gap, 83, 156
μF: average income under a welfare-neutral flat tax, 150
μT: mean tax, 188
μX: mean of variable X, 41, 49, 97, 132, 188, 192, 213, 220, 278
v: stochastic tax determinant, 128
ω(p): weight on income used in linear indices, 60, 174, 182
ω(p;ρ): weight on income used in S-Gini indices, 55, 56
: government support per capita budget, 226
: expected benefit at rank p, 189
: normalized stochastic dominance curve, 186
: expected net income of those at rank p in the distribution of gross incomes, 144
: normalized FGT indices, 186, 208
: income shares or normalized quantiles, 192
= Q(p)/μ: income shares or normalized quantiles, 184
: expected net tax of those with income X in the distribution of gross incomes, 188
: expected net tax rate of those with X(p) in terms of gross incomes, 134
: expected utility at rank p, 148
: expected value of income component c at rank p in the distribution of total income X, 97, 132, 212
: average population shares, 96
: normalized poverty gap, 186
: mean in stratum h, 285
: expected relative deprivation at rank p, 59
: contribution of between subgroup inequality to total inequality, 69
: normalized CD curves, 210
φ(k): share of the population found in subgroup k, 68, 87, 96
π: poverty function, 226
π(Q(p);z): contribution of Q(p) to poverty index, 165
π(y;z): poverty function, 227
π(i)(Q(p);z): ith order derivative of π(Q(p);z) with respect to Q(p), 165
πi: probability that unit i is selected, 274
ρ(p): consumption ratio x2(p)/x1(p), 107
σ: number of players in coalition , 71
: conditional variance of T at gross income X(p), 147
: population variance of yk, 290
: variance of y conditional on some value x, 269
: variance of , 279
var(y): variance of the population yhij, 286
ε: random term, 266
εG(z;α): Gini elasticity of FGT poverty indices, 217
εy(k;z;α): elasticity of total poverty with respect to total income with growth from component k, 213
εy(z;α): elasticity of total poverty with respect to total income, 214
ς: some positive value, 118
: estimator of f(y), 264
: the design-based estimator of the sampling variance, 286
ξ(ρ, ε): equally distributed equivalent (EDE) income, 61
ξ*(z;ρ, ε): equally distributed equivalent (EDE) income of censored income, 81
ξg(z): equally distributed equivalent (EDE) poverty gap, 88
ξg(z;α): equally distributed equivalent (EDE) poverty gap for the unnormalized FGT indices, 86, 88
ξN(ε): equally distributed equivalent (EDE) incomes for WN(ε), 149
: equally distributed equivalent (EDE) incomes for , 149
: equally distributed equivalent (EDE) incomes for , 149
ς+(s): upper bound of range of poverty lines over which dominance holds at order s, 176
c: child parameter in equivalence scale, 32
c: equivalence scale parameter, 169
c: redistributive costs, 233
ci: cost to i of accepting , 226
e: expenditure function, 23
e: net income for optimal recipients of state support, 231
f(Q(p)): density of income at p-quantile, 52
f(k;z): density of income at z for group k, 201
f(y): density function, 39, 206, 260, 262–264
f(i)(y): ith-order derivative of function f, 263
fh: function of a user-specified FPC factor, 285
g(p): income growth curve at p, 191, 192
g(p; z): poverty gap at percentile p, 42, 86, 173
h*: optimal bandwidth, 264
l: proportion of mean as relative poverty line, 186
lμ: proportion of mean as relative poverty line, 186
l+: upper bound of range of proportions over which dominance must be checked, 184, 188
l+: upper bound of range of proportions over which dominance must be checked, 185
m: the number of LSU, 285
mhi: the number of selected LSU in the PSU hi, 284
n(k): number of individuals in group k, 70
nh: the number of selected PSU in a stratum h, 284
Pi = i/n: percentile corresponding to ordered observation i, 50
q: percentile, 52
qc: price of good c, 207
qhij: the number of observations in selected LSU hij, 284
r: number of randomly selected individuals from a population, 56
r: percentile, 52
s: class of indices, 158
s: number of players in set S, 71
s: order of stochastic dominance, 276
s: order of stochastic dominance, 158, 169
s: size parameter in equivalence scale, 31
t: average tax as a proportion of average gross income, 129
t: population subgroup, 96
t: vector of tax rates, 209
t(X): expected tax at X as a proportion of X, 133
ti: number of times unit i appears in a random sample of size n, 274
tl: tax on l, 211
tl: tax rate on good l, 209-211
t(j): average tax T(j) over average gross income X, 146
w(u): weight function, 260
Wi: the sampling weight of observation i, 265
Whij: the sampling weight of LSU hij, 285
xc: consumption of commodity c, 109, 209
xc(q): consumption of commodity c with prices q, 208
xc(y, q): expected consumption of good c at income y when facing prices q, 208
xc(y, q, v): consumption of good c at income y and preferences v, when facing prices q, 207
xhijk: the value of X that appears in the sample for sample observation hijk, 285
y: income, 207
y: total nominal expenditure, 23
yR: equivalent consumption expenditure, 23
yR: real income, 207
yi: income of individual i, 41, 50, 161, 226
yhijk: sum of y in LSU hij, 285
yhi: relevant of y in PSU hi, 285
z: poverty line, 165, 167, 206
z(p): p-quantile of the standardized normal distribution, 292
z*: poverty line, 119
zk: minimum calorie intake recommended for a healthy life, 114
zF: minimal food expenditure necessary for living in good health, 106
zT: total poverty line, 106
zNF: required non-food expenditures, 106
: vector of n income levels, 159, 162
y: vector of n income levels, 159, 162
Aaberge, R., 33, 35, 72, 139, 297
Achdut, L., 139
Ahmad, E., 222