I'm trying to use Sentence Transformer to infer causal relationships between documents.
If this works, we can extract the causes and symptoms of an incident from its report.
So I wondered whether NLI could be used as a feature-learning task for extracting causal information.
## What is NLI?
NLI is the task of inferring the relationship between two sentences. The three relations are:
- Entailment (the first sentence implies the second)
- Contradiction (the first sentence contradicts the second)
- Neutral (the two sentences are unrelated)
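As a toy illustration (sentences invented here, not from any dataset), NLI pairs look like this:

```py
# Invented toy NLI pairs: (premise, hypothesis, label).
nli_examples = [
    ("A man is sleeping.", "A person is asleep.", "entailment"),
    ("A man is sleeping.", "The man is running a marathon.", "contradiction"),
    ("A man is sleeping.", "It is sunny today.", "neutral"),
]
```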
## Applying NLI to causal relationships
If we map the three NLI relations onto causality, the following patterns are possible:
- Contradiction: (effect, cause) pairs
- Entailment: (cause, effect) pairs
- Neutral: a pair of a cause or effect with text from an unrelated report
This yields at least three times as many training pairs as there are case reports.
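For instance (an invented toy report, not from the actual data), a single case yields three pairs:

```py
# Invented toy example: one case report produces three labeled pairs.
cause = "The cooling fan failed."
effect = "The server overheated and shut down."
other = "The office relocated last month."  # text from an unrelated report

pairs = [
    (effect, cause, "contradiction"),  # effect -> cause
    (cause, effect, "entailment"),     # cause -> effect
    (cause, other, "neutral"),         # cause vs. unrelated text
]
```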
## Example script
What we want to do:
- For each case in the data frame, create a contradiction pair and an entailment pair from its cause and effect text.
- Pick two rows at random from the data frame and create a neutral pair.
You can use the following script:
```py
import random

from sentence_transformers import InputExample

# Map the three NLI labels to integer class ids.
label2int = {"contradiction": 0, "entailment": 1, "neutral": 2}

traindev_samples = []
data_size = 400
train_batch_size = 32
(cause_text_list, effect_text_list, other_texts_list) = extract_cause_effect_other_text(df)

# Contradiction: (effect, cause) pairs.
for i in range(max(df["doc_no"])):
    effect_text = effect_text_list[i].replace(" ", "")
    cause_text = cause_text_list[i]
    if len(effect_text) == 0 or len(cause_text) == 0:
        continue
    traindev_samples.append(
        InputExample(texts=[effect_text, cause_text], label=label2int["contradiction"])
    )

# Entailment: (cause, effect) pairs.
for i in range(max(df["doc_no"])):
    effect_text = effect_text_list[i].replace(" ", "")
    cause_text = cause_text_list[i]
    if len(effect_text) == 0 or len(cause_text) == 0:
        continue
    traindev_samples.append(
        InputExample(texts=[cause_text, effect_text], label=label2int["entailment"])
    )

# Neutral: pair a cause or effect with text drawn from an unrelated report.
random.seed(45)
choice_patterns = {"cause2other": 0, "effect2other": 1, "other2cause": 2, "other2effect": 3}
for i in range(data_size):
    ind_0 = random.randint(0, max(df["doc_no"]) - 1)
    ind_1 = random.randint(0, max(df["doc_no"]) - 1)
    if ind_0 == ind_1:
        continue
    choice = random.randint(0, 3)
    other_texts = other_texts_list[ind_1]
    if len(other_texts) == 0:
        continue
    other_text = random.sample(other_texts, k=1)[0]
    if choice == choice_patterns["cause2other"]:
        traindev_samples.append(InputExample(texts=[cause_text_list[ind_0], other_text], label=label2int["neutral"]))
    elif choice == choice_patterns["effect2other"]:
        traindev_samples.append(InputExample(texts=[effect_text_list[ind_0], other_text], label=label2int["neutral"]))
    elif choice == choice_patterns["other2cause"]:
        traindev_samples.append(InputExample(texts=[other_text, cause_text_list[ind_0]], label=label2int["neutral"]))
    elif choice == choice_patterns["other2effect"]:
        traindev_samples.append(InputExample(texts=[other_text, effect_text_list[ind_0]], label=label2int["neutral"]))
```
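`traindev_samples` still mixes training and development data; a shuffled split separates them (a minimal sketch; the 90/10 ratio is my assumption):

```py
# Shuffle and split into train/dev sets (the 90/10 ratio is an assumption).
random.shuffle(traindev_samples)
split_index = int(len(traindev_samples) * 0.9)
train_samples = traindev_samples[:split_index]
dev_samples = traindev_samples[split_index:]
```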
## Set the loss function
We chose SoftmaxLoss, interpreting the task as a 3-class classification problem with labels 0, 1, and 2; in sentence-transformers, this loss concatenates the two sentence embeddings u and v with |u - v| and trains a linear classifier over the labels, as in the original SBERT NLI setup.
```py
from sentence_transformers.losses import SoftmaxLoss

train_loss = SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=len(label2int),
)
```
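During training the loss consumes batches of `InputExample`s, so the samples are wrapped in a standard PyTorch `DataLoader` (a short sketch reusing `train_samples` and `train_batch_size` from above):

```py
from torch.utils.data import DataLoader

# Batch the InputExample objects; model.fit attaches its own collate function.
train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=train_batch_size)
```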
## Training
For the concrete training procedure, see "How to train a Japanese model with Sentence transformer to get a distributed representation of a sentence".
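As a minimal sketch of the fit call (the base model name, epoch count, and warmup steps below are my assumptions; the linked article covers the actual Japanese setup):

```py
from sentence_transformers import SentenceTransformer, models

# Hypothetical Japanese base model; in practice, build the model *before*
# constructing the SoftmaxLoss above, since the loss takes it as an argument.
word_embedding_model = models.Transformer("cl-tohoku/bert-base-japanese-whole-word-masking")
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,          # an assumption; see the open questions below
    warmup_steps=100,  # an assumption
    output_path="./sbert-causal",
)
```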
## Questions
- How many epochs should we train for?
- Is that just a hyperparameter to tune?
- How does the amount of data affect training?
## See also
- Why is fasttext so fast?
- On using bagging of distributed representations for class classification and generalization performance
- How to train a Japanese model with Sentence transformer to get a distributed representation of a sentence
- Using BART (a sentence summarization model) with Hugging Face
- Procedure for obtaining a distributed representation of a Japanese sentence using a trained Universal Sentence Encoder