Reward Structures (reward + structure)
Selected Abstracts

Concurrent Q-learning: Reinforcement learning for dynamic goals and environments
INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 10 2005
Robert B. Ollington

This article presents a powerful new algorithm for reinforcement learning in problems where the goals and also the environment may change. The algorithm is completely goal independent, allowing the mechanics of the environment to be learned independently of the task that is being undertaken. Conventional reinforcement learning techniques, such as Q-learning, are goal dependent. When the goal or reward conditions change, previous learning interferes with the new task that is being learned, resulting in very poor performance. Previously, the Concurrent Q-Learning algorithm was developed, based on Watkins' Q-learning, which learns the relative proximity of all states simultaneously. This learning is completely independent of the reward experienced at those states and, through a simple action selection strategy, may be applied to any given reward structure. Here it is shown that the extra information obtained may be used to replace the eligibility traces of Watkins' Q-learning, allowing many more value updates to be made at each time step. The new algorithm is compared to the previous version and also to DG-learning in tasks involving changing goals and environments. The new algorithm is shown to perform significantly better than these alternatives, especially in situations involving novel obstructions. The algorithm adapts quickly and intelligently to changes in both the environment and reward structure, and does not suffer interference from training undertaken prior to those changes. © 2005 Wiley Periodicals, Inc. Int J Int Syst 20: 1037–1052, 2005.
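The goal dependence of conventional Q-learning that this abstract describes can be illustrated with a minimal sketch of standard tabular Q-learning. This is the conventional baseline, not the Concurrent Q-Learning algorithm the article proposes. The one-dimensional corridor world, the function names `q_learning` and `greedy_policy`, the uniformly random behavior policy, and all parameter values are assumptions made for illustration only.

```python
import random


def q_learning(n_states, goal, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Standard tabular Q-learning on a hypothetical 1-D corridor.

    Actions: 0 = step left, 1 = step right (walls clamp the position).
    Reward is 1.0 on entering the goal state and 0.0 otherwise.
    The behavior policy is uniformly random; Q-learning is off-policy,
    so the greedy policy is still learned from this experience.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = rng.randrange(n_states)
        for _ in range(50):  # cap on steps per episode (assumed value)
            if s == goal:
                break
            a = rng.randrange(2)
            s2 = min(max(s + (1 if a == 1 else -1), 0), n_states - 1)
            r = 1.0 if s2 == goal else 0.0
            # The goal is terminal, so no bootstrapped value is added there.
            target = r if s2 == goal else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q


def greedy_policy(q):
    """Return the greedy action for every state (ties resolve to action 0)."""
    return [max((0, 1), key=lambda act: row[act]) for row in q]


if __name__ == "__main__":
    # The same corridor with the goal at opposite ends.
    print(greedy_policy(q_learning(6, goal=5)))
    print(greedy_policy(q_learning(6, goal=0)))
```

In this sketch the learned table encodes the reward location directly: with the goal at the right end the greedy policy should point right in every non-goal state, and with the goal at the left end it should point left. A table trained for one goal is therefore actively misleading once the goal moves, which is the interference the abstract refers to.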

Equal Pay for Unequal Work: Limiting Sabotage in Teams
JOURNAL OF ECONOMICS & MANAGEMENT STRATEGY, Issue 1 2010
Arup Bose

We demonstrate the value of "equal pay" policies in teams, even when team members have distinct abilities and make different contributions to team performance. A commitment to compensate all team members in identical fashion eliminates the incentive that each team member otherwise has to sabotage the activities of teammates in order to induce the team owner to implement a more favorable reward structure. The reduced sabotage benefits the team owner, and can secure Pareto gains under plausible circumstances.

Green light for greener supply
BUSINESS ETHICS: A EUROPEAN REVIEW, Issue 4 2002
Lutz Preuss

The supply chain management function is currently undergoing a dramatic change: it is adopting an increasingly strategic role. However, this growing financial importance is matched in only a handful of exemplary companies by a greater contribution to environmental protection initiatives in the supply chain. This paper explores some of the obstacles to greater supply chain management involvement in environmental protection and offers suggestions for greener supply. At a personal level, the gap between public opinion on the environment and managerial values needs to be closed, and the support offered by management education and by professional bodies needs to be improved. Within the organisation, the reward structure for supply chain managers needs to move away from narrow economic criteria. Greener supply would also benefit from a larger supply chain management role in corporate strategy making; the function could even be offered a seat on the Board of Management. Changes to the mode of supply chain management, including improvements to the information flow on environmental issues, the decision-making tools used in the face of complex environmental challenges and novel approaches to supply chain management, need to receive urgent attention.

Curricular Planning along the Fault Line between Instrumental and Academic Agendas: A Response to the Report of the Modern Language Association on Foreign Languages and Higher Education: New Structures for a Changed World
DIE UNTERRICHTSPRAXIS/TEACHING GERMAN, Issue 2 2009
Ingeborg Walther

In calling for new governance structures and unified curricula, the MLA Report distinguishes between instrumental and constitutive views of language that characterize our often schizophrenic agendas of language acquisition on the one hand, and disciplinary knowledge on the other. This paper explores some common theoretical insights from the fields of language acquisition and cultural studies that interrogate these views, providing a basis for sustained collaboration around curricula among faculty on both sides of the divide. While these have already yielded the kinds of curricular innovations recommended by the Report, a case is made for more radical changes in hiring practices, distribution of teaching and service, reward structures, and graduate education, changes which have the capacity to transform the institutional values upon which they will also depend.

Subjective neuronal coding of reward: temporal value discounting and risk
EUROPEAN JOURNAL OF NEUROSCIENCE, Issue 12 2010
Wolfram Schultz

A key question in the neurobiology of reward relates to the nature of coding. Rewards are objects that are advantageous or necessary for the survival of individuals in a variety of environmental situations. Thus reward appears to depend on the individual and its environment. The question arises whether neuronal systems in humans and monkeys code reward in subjective terms, objective terms or both. The present review addresses this issue by dealing with two important reward processes, namely the individual discounting of reward value across temporal delays, and the processing of information about risky rewards that depends on individual risk attitudes.
The subjective value of rewards decreases with the temporal distance to the reward. In experiments using neurophysiology and brain imaging, dopamine neurons and striatal systems discount reward value across temporal delays of a few seconds, despite unchanged objective reward value, suggesting subjective value coding. The subjective values of risky outcomes depend on the risk attitude of individual decision makers; these values decrease for risk-avoiders and increase for risk-seekers. The signal for risk and the signal for the value of risky reward covary with individual risk attitudes in regions of the human prefrontal cortex, suggesting subjective rather than objective coding of risk and risky value. These data demonstrate that important parameters of reward are coded in a subjective manner in key reward structures of the brain. However, these data do not rule out that other neurons or brain structures may code reward according to its objective value and risk.

How Green Was My Valley? An Examination of Tournament Theory as a Governance Mechanism in Silicon Valley Law Firms
LAW & SOCIETY REVIEW, Issue 4 2003

The tournament model is a widely used mechanism to control opportunistic behavior by associates in law firms. However, this mechanism can only operate in certain economic (and social) circumstances. When those circumstances do not exist, the model breaks down, and with it the ability to control opportunism in the absence of some alternative mechanism. Prior research has not investigated whether the utilization of a tournament model prevents the opportunistic behaviors identified as grabbing, leaving, and shirking. In order to test the limits of the tournament model, it is necessary to find particular historical moments when the economic environment radically challenges assumptions/premises of the model. The dot-com bubble in Silicon Valley provides precisely such a time and place.
This article demonstrates limits to the applicability of tournament theory. Those limits are to be found in the economic environment in circumstances in which: (1) exogenous reward structures offer many multiples of internal rewards; (2) demonstrably high short-term rewards outside the firm starkly contrast with the delayed long-term rewards inside the firm; (3) the managerial strata reduce their emphasis on long-term recruiting of potential partners in favor of short-term productivity by young associates; and (4) firms develop departmental leverage ratios in excess of their capacity to monitor, mentor, and train recruits.

Employee empowerment in manufacturing: a study of organisations in the UK
NEW TECHNOLOGY, WORK AND EMPLOYMENT, Issue 2 2002
Anna Psoinos

Based on a postal survey and interviews, this paper analyses employee empowerment in the UK manufacturing industry, including how it is pursued and perceived, and the key factors that determine success. Success seems to depend on far-reaching changes in procedures, hierarchies and reward structures. This need to mobilise individual agents and structure reconfirms the agency-structure duality.