Human Value Detection 2024

ValueEval'24


Synopsis

Sub-Task 1: Given a text, for each sentence, detect which human values the sentence refers to.
Sub-Task 2: Given a text, for each sentence and human value this sentence refers to, detect whether this reference (partially) attains or (partially) constrains the value.
Communication: [mailing lists: task, organizers] [twitter/x]
Data: [download] [project]
Submission: [example approaches] [evaluator] [forum] [submit]
ValueEval'23: [website] [demo]

Important Dates

Subscribe to the mailing list to receive notifications.

Dec. 18, 2023: CLEF Registration opens. [register]
May 6, 2024: Approaches submission deadline.
May 31, 2024: Participant paper submission.
June 21, 2024: Peer review notification.
July 8, 2024: Camera-ready participant papers submission.
Sep. 9-12, 2024: CLEF Conference in Grenoble and Touché Workshop.

All deadlines are 23:59 anywhere on earth (UTC-12).

Task

Identify human values (sub-task 1) and their attainment (sub-task 2) in long texts in nine languages.

The task employs a collection of roughly 3000 human-annotated texts of 400 to 800 words each, created as part of the ValuesML project. The annotated values are those of Schwartz's value continuum, each described below:

Self-direction: thought

Freedom to cultivate one's own ideas and abilities.

Personal motivation: It is important to be creative, forming own opinions, be unique, have original ideas, learn things for oneself and improve own abilities.

The focus of this value is on developing own ideas, wanting to know more and discovering.

Self-direction: action

Freedom to determine one's own actions.

Personal motivation: It is important to make own decisions about life, being independent and having the freedom to choose.

In contrast to "self-direction: thought", the focus of this value is on determining an action rather than a conviction or thought.

Stimulation

Excitement, novelty, and change.

Personal motivation: Always looking for something new to do, doing something exciting, seeking out new experiences, innovating, being bold, seeking adventures and initiating change.

In contrast to "hedonism", this value focuses on the novelty and risk aspects of behaviours and thoughts. It is seeking out everything that stimulates the senses.

Hedonism

Pleasure and sensuous gratification.

Personal motivation: Having a good time, enjoying life’s pleasures and taking advantage of opportunities to have fun.

Achievement

Success according to social standards.

Personal motivation: Being ambitious, successful and being admired for achievements and skills. Demonstrating competence according to social standards or in competition.

An important dimension of achievement is that it is perceived within social standards and according to the rules of engagement, unlike the value of power. Moreover, unlike the value of power, achievement focuses on performance and not on resource matters.

Power: dominance

Power through exercising control over people.

Personal motivation: Want people to follow you, being the most influential compared to others, be the one to determine directions.

Power: resources

Power through control of material and social resources.

Personal motivation: Having lots of money for the power it brings, being wealthy, and pursuing social status.

This value is not automatically present when terms like budgets, costs, or growth appear in a sentence. Such terms refer to the value only if they make up an essential part of a sentence and are expressed as an important part of decision making with an underlying motivation or justification, for example, when someone writes about "huge", "large", or "out of control" costs, or otherwise presents them as an important argument. Finally, in contrast to the value of "achievement", it focuses only on resource matters, not on performance.

Face

Security and power through maintaining one’s public image and avoiding humiliation.

Personal motivation: Does not want to be shamed by others, protecting public image, being treated with respect, honour, and dignity.

Security: personal

Safety in one's immediate environment.

Personal motivation: Avoid dangerous situations, value personal security and safety, live in a secure environment, have a secured income, being healthy.

Similar to the difference between "benevolence" and "universalism: concern", the decisive difference between "security: personal" and "security: societal" is to whom the value refers. In the case of individuals, family, and friends, it is personal, e.g., if someone writes in first person about their own health. In contrast, when it is applied to any group or to society as a whole, it is societal. Thus, "security: personal" will most likely not appear often in the texts, and only where they speak of individual experiences.

Security: societal

Safety and stability in the wider society.

Personal motivation: Country should protect itself against all threats, state should be strong, order and stability in society are important, including economic stability (employment, no recession).

Importantly, the value refers not only to a society as a whole but also to socially defined groups like women, parents, etc. within a society. In contrast to the value of "universalism: concern", the value emphasises protection more from a motivation of "preventing", "averting", "ending" dangers or threats and "preserving" security and stability.

Tradition

Maintaining and preserving cultural, family, or religious traditions.

Personal motivation: Maintain traditional beliefs and values, follow the family or religious customs, valuing traditional practices of one’s culture.

Conformity: rules

Compliance with rules, laws, and formal obligations.

Personal motivation: Should follow authorities, follow rules even if others are not watching, obey all laws.

Conformity: interpersonal

Avoidance of upsetting or harming other people.

Personal motivation: Avoid upsetting or annoying others, being tactful to others, showing courtesy, being polite, resisting temptation, respecting elders.

Humility

Recognizing one's insignificance in the larger scheme of things.

Personal motivation: Try not to draw attention, be humble and satisfied with the situation, not asking for more.

Benevolence: caring

Devotion to the welfare of in-group members.

Personal motivation: Help and care about close ones, be responsive to family and friends. Actively helping and taking care of someone close.

A key difference between "benevolence: caring" and "benevolence: dependability" is that the caring dimension is about actively doing something, while dependability is more about the perception of being there for someone, being trustworthy, etc. Furthermore, caring here is not generic or universal; it is limited to specific persons or a delineated in-group within a person’s immediate environment. This last aspect distinguishes "benevolence: caring" from "universalism: concern". Similar to the two sides of security, the decisive difference between "benevolence: caring" and "universalism: concern" is to whom the value refers. In the case of individuals, family, and friends, it is "benevolence: caring". In contrast, when it is applied to any social group or people in general, it is "universalism: concern". Thus, "benevolence: caring" will most likely not appear often in the texts, and only where they speak of individual experiences.

Benevolence: dependability

Being a reliable and trustworthy member of the in-group.

Personal motivation: Be loyal to close ones, be dependable and trustworthy, especially to close ones (in-group). Be seen as reliable, others should have confidence in you helping close ones.

A key difference between "benevolence" and "universalism" is that "benevolence" is primarily targeted towards close ones, and not towards strangers. In line with "security: personal", close ones are family and friends, not the larger public.

Universalism: concern

Commitment to equality, justice, and protection for all people.

Personal motivation: Protecting the weak and vulnerable, care about equal opportunities, treat everyone justly.

In contrast to the value of "security: societal", protection and caring also go beyond the boundaries of a society. The value is more generic, referring to all kinds of people and groups. Moreover, "universalism: concern" focuses more on caring, protecting, and promoting well-being, especially of vulnerable people, from a motivation of "empathy", "helping" or "a universal justice perspective", whereas "security: societal" focuses on protecting and promoting well-being more from a motivation of "preventing", "averting", "ending" dangers or threats and "preserving" security and stability.

Universalism: nature

Preservation of the natural environment.

Personal motivation: Care about nature for nature's sake, protect the environment against pollution, destruction and other threats.

Universalism: tolerance

Acceptance and understanding of those who are different from oneself.

Personal motivation: Care about peace and harmony, listen to people with other views, understand even those one disagrees with.

The key point here is that it is peace for harmony's sake and not for the protection of the weak, which is covered in "universalism: concern" already.

Each referred value (detected in sub-task 1) can either be mentioned as something that is or should be attained or something that is not attained or constrained (detected in sub-task 2). Attainment would mean that whatever is described in the sentence will help lead to fulfilling the value. For Security (personal or societal), attainment would mean that something is made safer or healthier. In contrast, an event can be stated in a way that thwarts/constrains safety or health.

Data

Data is provided as tab-separated values files with one header line. The data stems from the ValuesML project. In addition to the original files in nine languages we provide a machine-translated version in English. Stay up-to-date and report problems on the task mailing list.

The sentences.tsv file contains one sentence per line: the Text-ID identifies the text that contains the sentence, the Sentence-ID gives the index of the sentence in the text, and Text is the sentence text itself. Toy example (columns are tab-separated): [toy example]

Text-ID	Sentence-ID	Text
A	1	First text first sentence.
A	2	First text second sentence.
A	3	First text third sentence.
B	1	Second text first sentence.
B	2	Second text second sentence.
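
A minimal sketch of reading this format with Python's standard csv module. The helper name read_sentences and the in-memory toy string are illustrative only; treating quote characters as literal text (csv.QUOTE_NONE) is an assumption about the files, not something stated by the task.

```python
import csv
import io

# Toy content mirroring the example above; a real run would open sentences.tsv instead.
TOY_SENTENCES = (
    "Text-ID\tSentence-ID\tText\n"
    "A\t1\tFirst text first sentence.\n"
    "A\t2\tFirst text second sentence.\n"
    "A\t3\tFirst text third sentence.\n"
    "B\t1\tSecond text first sentence.\n"
    "B\t2\tSecond text second sentence.\n"
)


def read_sentences(handle):
    """Group sentences by Text-ID, ordered by Sentence-ID (hypothetical helper)."""
    texts = {}
    # Assumption: quote characters inside sentences are plain text, not CSV quoting.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        entry = (int(row["Sentence-ID"]), row["Text"])
        texts.setdefault(row["Text-ID"], []).append(entry)
    # Sort by sentence index so each text can be processed in reading order.
    return {
        text_id: [sentence for _, sentence in sorted(entries)]
        for text_id, entries in texts.items()
    }


texts = read_sentences(io.StringIO(TOY_SENTENCES))
```

Grouping by Text-ID keeps the sentences of one text together, which is convenient for approaches that use the neighbouring sentences as context.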

The labels.tsv file contains one sentence per line: the Text-ID and Sentence-ID as above, and for each of the 19 values two columns. In the "attained" column of a value, a 1 means that the sentence refers to this value and (partially) attains it; in the "constrained" column of a value, a 1 means that the sentence refers to this value and (partially) constrains it. If both are 0, the sentence does not refer to that value at all. If both are 0.5, the sentence refers to the value but it is unclear whether it (even partially) attains or constrains it. Toy example (columns are tab-separated): [toy example]

Text-ID	Sentence-ID	Self-direction: thought attained	Self-direction: thought constrained	Self-direction: action attained	Self-direction: action constrained	Stimulation attained	Stimulation constrained	Hedonism attained	Hedonism constrained	Achievement attained	Achievement constrained	Power: dominance attained	Power: dominance constrained	Power: resources attained	Power: resources constrained	Face attained	Face constrained	Security: personal attained	Security: personal constrained	Security: societal attained	Security: societal constrained	Tradition attained	Tradition constrained	Conformity: rules attained	Conformity: rules constrained	Conformity: interpersonal attained	Conformity: interpersonal constrained	Humility attained	Humility constrained	Benevolence: caring attained	Benevolence: caring constrained	Benevolence: dependability attained	Benevolence: dependability constrained	Universalism: concern attained	Universalism: concern constrained	Universalism: nature attained	Universalism: nature constrained	Universalism: tolerance attained	Universalism: tolerance constrained
A	1	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0
A	2	1	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0
A	3	0.5	0.5	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	1
B	1	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0
B	2	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1

For sub-task 1, an approach thus has to identify, for each sentence, the values for which at least one of the two columns is not 0. For sub-task 2, an approach has to identify, for each sentence and each value for which one of the two columns is 1, whether the 1 is in the attained or the constrained column.
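
The two readings of a labels.tsv row can be sketched as follows. The function gold_labels and the reduced toy row (three of the 19 values, taken from sentence A 3 in the toy example) are illustrative assumptions, not part of the official tooling.

```python
def gold_labels(row, values):
    """Derive the sub-task 1 and sub-task 2 targets from one labels.tsv row.

    `row` maps column names to strings, as csv.DictReader would produce.
    """
    referenced = {}  # sub-task 1: does the sentence refer to the value?
    polarity = {}    # sub-task 2: only for values with a 1 in one column
    for value in values:
        attained = float(row[f"{value} attained"])
        constrained = float(row[f"{value} constrained"])
        referenced[value] = attained > 0 or constrained > 0
        if attained == 1.0:
            polarity[value] = "attained"
        elif constrained == 1.0:
            polarity[value] = "constrained"
        # 0.5 / 0.5: referenced, but no attained/constrained label exists.
    return referenced, polarity


toy_values = ["Self-direction: thought", "Power: dominance", "Face"]
toy_row = {
    "Text-ID": "A",
    "Sentence-ID": "3",
    "Self-direction: thought attained": "0.5",
    "Self-direction: thought constrained": "0.5",
    "Power: dominance attained": "1",
    "Power: dominance constrained": "0",
    "Face attained": "0",
    "Face constrained": "0",
}
referenced, polarity = gold_labels(toy_row, toy_values)
```

Here "Self-direction: thought" counts as referenced for sub-task 1 but contributes no label for sub-task 2, since neither column holds a 1.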

Submission

Submit your approach via TIRA. Ask in the Forum if you need help. You need to register your team (in addition to registering at CLEF) and pick an alias for your team name (submission is anonymous; you can reveal your true team name after final paper acceptance). You submit a Docker image or submit from your GitHub repository (via automated Docker building). In case of trouble, you can also submit via run file upload (not recommended due to poor reproducibility; rather contact us if you need help with Dockerization). You can submit on the validation dataset to check how submission works, or on the test dataset. You will not be able to see your results on the test dataset until after the deadline. Datasets are provided multilingual or in machine-translated English (see Data). [forum] [submit]

We recommend starting your approach from one of our example approaches (in Python), which include the code for reading and writing the files and make it easy to later deploy your approach as a server or to submit and distribute it as a Docker image. [random baseline: script, notebook] [bert baseline] [ollama baseline]

Approaches need to produce run files that have the same format as the labels.tsv, but the numbers can be between 0 and 1 and are interpreted as the confidence of the approach (employed for evaluation via ROC-curves): [toy example]

For sub-task 1: For each sentence and value, the sum of the numbers in the attained and constrained columns should be the confidence of your approach in that the sentence references the value. A sum ≥ 0.5 is treated as a positive prediction for purposes of evaluation with precision, recall, and F1-score.
Example: An approach predicts for some sentence, "Face attained" = 0.1 and "Face constrained" = 0.2 ⇒ approach's confidence (sentence refers to "Face") = 0.1 + 0.2 = 0.3 ⇒ approach's confidence < 0.5 ⇒ the approach predicted the sentence does not refer to "Face".
For sub-task 2: For each sentence and value, the number for attained normalized (i.e., divided) by the sum should be the confidence of your approach in that the reference (partially) attains the value. A normalized number ≥ 0.5 is treated as a prediction for attained (and otherwise for constrained) for purposes of evaluation with precision, recall, and F1-score. Thus the higher number is treated as the prediction.
Example: An approach predicts for some sentence, "Face attained" = 0.1 and "Face constrained" = 0.2 ⇒ approach's confidence (sentence rather attains than constrains "Face") = 0.1 / (0.1 + 0.2) = 0.33 ⇒ approach's confidence < 0.5 ⇒ the approach predicted the sentence rather constrains than attains "Face".
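
The two rules above can be sketched in a few lines. The function name interpret is hypothetical, and returning None when both numbers are 0 is an assumption made here to avoid a division by zero; the text above does not say how the evaluator treats that case.

```python
def interpret(attained, constrained, threshold=0.5):
    """Turn the two run-file numbers for one sentence and value into predictions.

    Returns (refers, polarity): the sub-task 1 decision and the sub-task 2
    decision ("attained", "constrained", or None if both numbers are 0).
    """
    reference_confidence = attained + constrained  # sub-task 1 confidence
    refers = reference_confidence >= threshold
    if reference_confidence == 0:
        return refers, None  # assumption: the ratio is undefined for 0 / 0
    attain_confidence = attained / reference_confidence  # sub-task 2 confidence
    polarity = "attained" if attain_confidence >= threshold else "constrained"
    return refers, polarity


# The "Face" example from the text: 0.1 attained, 0.2 constrained.
face = interpret(0.1, 0.2)
# A second, made-up case in which both decisions come out positive.
other = interpret(0.6, 0.3)
```

For the "Face" example this yields no reference for sub-task 1 and "constrained" for sub-task 2, matching the two worked examples.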

Note that you submit for both sub-tasks with the same file. If you want to participate only in sub-task 1, always set the number for constrained to 0. If you want to participate only in sub-task 2, the sum of attained and constrained for a value does not matter, only their ratio. If you want to participate in both sub-tasks, avoid submitting the same number for attained and constrained for a value even if your approach is certain that the value is not referenced: if the approach is wrong and the value is actually referenced, it still matters for sub-task 2 which number is the larger one.
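
One way to serve both sub-tasks from a single pair of numbers is to split the reference confidence according to the attainment confidence. This is a sketch, not the organizers' method: the function encode, its parameters, and the small floor value are assumptions introduced for illustration.

```python
def encode(p_reference, p_attained, floor=1e-6):
    """Encode two confidences as the (attained, constrained) pair of a run file.

    p_reference: confidence that the sentence refers to the value (sub-task 1).
    p_attained:  confidence that a reference attains the value (sub-task 2).
    floor:       hypothetical minimum sum, so the ratio survives when
                 p_reference is 0.
    """
    total = max(p_reference, floor)
    attained = total * p_attained
    constrained = total * (1.0 - p_attained)
    return attained, constrained


# Sum is about 0.3 (sub-task 1), ratio is about 1/3 (sub-task 2).
a, c = encode(0.3, 1 / 3)
# Certain "not referenced", but still leaning towards attained.
a0, c0 = encode(0.0, 0.9)
```

With this split the sum of the pair equals the reference confidence and the normalized attained number equals the attainment confidence. If numbers are rounded when the file is written, the floor has to be large enough to survive that rounding, and a p_attained of exactly 0.5 still produces two equal numbers.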

Evaluation

For both sub-tasks, the submission system will evaluate runs automatically using F1-score, Precision, Recall, and ROC-curves (for each value and averaged). Runs on the sub-task leaderboards are ranked according to averaged F1-score. Use our evaluator to produce a detailed report (as HTML) for your runs, including ROC curves and tables for identifying the sentences your approach got the most wrong. [evaluator]
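
As a rough illustration of the ranking measure, the sketch below computes the F1-score per value and then takes the unweighted mean over values. That the official average is this macro average, and that undefined precision or recall counts as 0, are assumptions; the provided evaluator is the authoritative implementation. The value names and label lists are made up.

```python
def f1_score(gold, predicted):
    """F1-score for one value from parallel lists of booleans."""
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g and p)
    false_pos = sum(1 for g, p in pairs if not g and p)
    false_neg = sum(1 for g, p in pairs if g and not p)
    # Assumption: precision and recall are 0 when their denominator is 0.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_value):
    """Unweighted mean of per-value F1-scores (assumed averaging scheme)."""
    scores = [f1_score(gold, predicted) for gold, predicted in per_value.values()]
    return sum(scores) / len(scores)


toy = {
    "Face": ([True, True, False, False], [True, False, True, False]),
    "Tradition": ([True, False], [True, False]),
}
score = macro_f1(toy)
```

In the toy data "Face" has one true positive, one false positive, and one false negative (F1 = 0.5), while "Tradition" is predicted perfectly (F1 = 1.0).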

Related Work

Johannes Kiesel, Milad Alshomary, Nailia Mirzakhmedova, Maximilian Heinrich, Nicolas Handke, Henning Wachsmuth, and Benno Stein. SemEval-2023 Task 4: ValueEval: Identification of Human Values behind Arguments. In Ritesh Kumar et al., editors, 17th International Workshop on Semantic Evaluation (SemEval 2023), pages 2287-2303, July 2023. Association for Computational Linguistics.
Johannes Kiesel, Milad Alshomary, Nicolas Handke, Xiaoni Cai, Henning Wachsmuth, and Benno Stein. Identifying the Human Values behind Arguments. In Smaranda Muresan, Preslav Nakov, and Aline Villavicencio, editors, 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), pages 4459-4471, May 2022. Association for Computational Linguistics.
Mario Scharfbillig, Laura Smillie, David Mair, Marta Sienkiewicz, Julian Keimer, Raquel Pinho Dos Santos, Hélder Vinagreiro Alves, Elise Vecchione, and Laurenz Scheunemann. Values and Identities - a Policymaker’s Guide. Technical Report KJ-NA-30800-EN-N, 2021. European Commission’s Joint Research Centre.
Shalom H. Schwartz, Jan Cieciuch, Michele Vecchione, Eldad Davidov, Ronald Fischer, Constanze Beierlein, Alice Ramos, Markku Verkasalo, Jan-Erik Lönnqvist, Kursad Demirutku, Ozlem Dirilen-gumus, and Mark Konty. Refining the Theory of Basic Individual Values. Journal of Personality and Social Psychology, 103(4), pages 663-688, 2012.