F1
F1MicroMultipleFieldsMetric
Bases: MetricCollection
Source code in src/kibad_llm/metrics/f1.py
__init__(fields, format_as_markdown=True, sort_fields=False, **kwargs)
Computes F1MicroSingleFieldMetric for multiple fields at once, together with the micro average (ALL) and the macro average (AVG) over all fields.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fields | list[str] | List of fields to compute F1MicroSingleFieldMetric for. | required |
| format_as_markdown | bool | Whether to format the result as a markdown table. Defaults to True. | True |
| **kwargs | | Additional keyword arguments for F1MicroSingleFieldMetric, e.g., ignore_subfields. | {} |
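The difference between the micro (ALL) and macro (AVG) rows can be illustrated with a minimal, self-contained sketch. It does not use the kibad_llm classes: the helpers `scores` and `aggregate` and the field names are hypothetical, and it assumes that ALL pools the tp/fp/fn counts of all fields before scoring, that AVG is the unweighted mean of the per-field scores, and that a score is 0.0 when its denominator is zero.

```python
def scores(tp: int, fp: int, fn: int) -> dict[str, float]:
    """Precision, recall and F1 from raw counts (0.0 on empty denominators, an assumption)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def aggregate(per_field_counts: dict[str, dict[str, int]]) -> dict[str, dict[str, float]]:
    """Per-field scores plus a pooled (ALL) and an averaged (AVG) row."""
    per_field = {name: scores(**c) for name, c in per_field_counts.items()}

    # Micro: sum the counts over all fields, then score once.
    pooled = {k: sum(c[k] for c in per_field_counts.values()) for k in ("tp", "fp", "fn")}
    micro = scores(**pooled)

    # Macro: score each field separately, then take the unweighted mean.
    n = len(per_field) or 1
    macro = {k: sum(s[k] for s in per_field.values()) / n for k in ("precision", "recall", "f1")}

    return {**per_field, "ALL": micro, "AVG": macro}


# Hypothetical counts for two fields.
counts = {
    "title": {"tp": 8, "fp": 2, "fn": 2},
    "authors": {"tp": 1, "fp": 1, "fn": 3},
}
result = aggregate(counts)
```

In this sketch the field with more entries dominates ALL, while AVG weights both fields equally, so a weak small field pulls AVG down more than ALL.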
Source code in src/kibad_llm/metrics/f1.py
F1MicroSingleFieldMetric
Bases: MetricWithPrepareEntryAsSet
Computes micro averaged precision, recall, and F1 score for single- and multi-label classification tasks.
The metric operates on sets and allows for simple preprocessing, see _prepare_entry for details.
WARNING: Because the metric operates on sets, it hides duplicate labels that the LLM produces in multi-label settings. E.g., prediction = ["A", "A", "B"] and reference = ["A", "B"] is treated as a perfect prediction with tp=2, fp=0, fn=0, even though the prediction contains the duplicate label "A".
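The effect described in the warning can be reproduced with plain Python sets. This is a hedged sketch, not the library code: `count_as_sets` and `has_duplicates` are hypothetical helpers, and the counting is assumed to be simple set intersection and difference.

```python
def count_as_sets(prediction: list[str], reference: list[str]) -> dict[str, int]:
    """Count tp/fp/fn after converting both sides to sets (duplicates collapse)."""
    pred, ref = set(prediction), set(reference)
    return {
        "tp": len(pred & ref),  # labels in both prediction and reference
        "fp": len(pred - ref),  # predicted labels missing from the reference
        "fn": len(ref - pred),  # reference labels that were not predicted
    }


def has_duplicates(labels: list[str]) -> bool:
    """Separate check that can surface duplicates before they are collapsed."""
    return len(labels) != len(set(labels))


with_duplicate = count_as_sets(["A", "A", "B"], ["A", "B"])
without_duplicate = count_as_sets(["A", "B"], ["A", "B"])
duplicate_flag = has_duplicates(["A", "A", "B"])
```

Both calls yield the same counts, so a check along the lines of `has_duplicates` has to run on the raw prediction if duplicate labels matter for the evaluation.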
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| **kwargs | | Keyword arguments for entry-to-set preparation. See | {} |
Source code in src/kibad_llm/metrics/f1.py
calculate_scores(state)
staticmethod
Calculates precision, recall and f1 from true positives, false positives and false negatives.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| state | dict[str, int] | dictionary with keys "tp", "fp", "fn" | required |
Returns: dictionary with precision, recall and f1.
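As a rough illustration of this computation, the sketch below derives the three scores from a state dictionary. It is not the source of `calculate_scores`: the function name `calculate_scores_sketch`, the output keys, and the choice to return 0.0 when a denominator is zero are all assumptions.

```python
def calculate_scores_sketch(state: dict[str, int]) -> dict[str, float]:
    """Precision, recall and F1 from true positives, false positives and false negatives."""
    tp, fp, fn = state["tp"], state["fp"], state["fn"]

    # precision = tp / (tp + fp); assumed to be 0.0 when nothing was predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    # recall = tp / (tp + fn); assumed to be 0.0 when the reference is empty
    recall = tp / (tp + fn) if tp + fn else 0.0
    # f1 is the harmonic mean of precision and recall
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {"precision": precision, "recall": recall, "f1": f1}


perfect = calculate_scores_sketch({"tp": 2, "fp": 0, "fn": 0})
partial = calculate_scores_sketch({"tp": 1, "fp": 1, "fn": 3})
empty = calculate_scores_sketch({"tp": 0, "fp": 0, "fn": 0})
```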
Source code in src/kibad_llm/metrics/f1.py
reset()
Resets all values of the internal state to zero.
Source code in src/kibad_llm/metrics/f1.py