
Using AI to improve Community Notes #163

Open
JayThibs opened this issue Nov 14, 2023 · 14 comments

@JayThibs

In order to improve the speed at which important Community Notes get added and to help note writers write better notes, I'm curious whether people have put some effort into using a mix of AI (language models) and simpler methods. I'd like to help with this if we can make it work economically.

For example, you could have a scaffolding approach that first looks for specific words, then feeds candidates into an embedding model for semantic similarity to contentious issues, and finally into an LLM that ranks how important it is for the tweet to get a Community Note and gathers some additional context (through a web search and the LLM's internal knowledge) to help the note writer. I think there's a way to make this economically viable for companies.
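
Here is a minimal sketch of that scaffolding, assuming the OpenAI Python client (v1); the keyword list, topic list, similarity cutoff, model names, and prompt are all placeholder assumptions, not a worked-out design:

```python
# Hypothetical three-stage filter: keyword gate -> embedding similarity -> LLM ranking.
# Assumes the openai v1 Python client; lists, cutoff, models, and prompt are placeholders.
from openai import OpenAI
import numpy as np

client = OpenAI()

SENSITIVE_WORDS = {"vaccine", "election", "war"}              # illustrative only
CONTENTIOUS_TOPICS = ["claims about vaccine safety",
                      "claims about election fraud"]          # illustrative only

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([d.embedding for d in resp.data])

topic_vecs = embed(CONTENTIOUS_TOPICS)

def needs_note_score(tweet: str) -> float:
    # Stage 1: cheap keyword gate.
    if not any(w in tweet.lower() for w in SENSITIVE_WORDS):
        return 0.0
    # Stage 2: semantic similarity to known contentious topics.
    v = embed([tweet])[0]
    sims = topic_vecs @ v / (np.linalg.norm(topic_vecs, axis=1) * np.linalg.norm(v))
    if sims.max() < 0.3:                                       # arbitrary cutoff
        return float(sims.max())
    # Stage 3: ask an LLM how much the tweet needs a Community Note.
    chat = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "system",
                   "content": "Rate from 0 to 10 how much this tweet needs a "
                              "Community Note. Reply with a single number."},
                  {"role": "user", "content": tweet}],
    )
    # A real system would parse this response more defensively.
    return float(chat.choices[0].message.content.strip()) / 10
```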

Yes, companies: I want Community Notes to expand beyond X. Let's figure out how to connect it to YouTube. Why haven't other social media websites picked it up yet? If they care about truth, this would be a considerable step forward. Notes like “this video is funded by x nation” or “this video talks about health info; go here to learn more” are simply not good enough. We need to improve the state of truth-seeking on the internet.

Not just that, as an AI Safety researcher, this is particularly important to me. Don't forget that we train language models on the internet! The more truthful your dataset is, the more truthful the models will be! Let's revamp the internet for truthfulness, and we'll subsequently improve truthfulness in our AI systems!!

@JayThibs
Author

JayThibs commented Nov 14, 2023

I used GPT-4 to come up with a little estimate of the cost:

To estimate the cost of using GPT-3.5-turbo to verify whether a tweet contains misinformation, we need to consider both input and output token costs and the average token count per tweet analysis.

Average Token Count per Tweet Analysis:

A tweet can be up to 280 characters long. Considering that an average English word is around 4.5 characters plus a space, this translates to about 280 / 5.5 ≈ 51 words per tweet.
Tokens are not equivalent to words; a token can be a word, a part of a word, or punctuation. On average, we can estimate around 1.5 tokens per word, considering that longer words might be split into multiple tokens and punctuation also counts as tokens.
Therefore, each tweet can be estimated to involve about 51 * 1.5 ≈ 77 tokens.

Input Cost Calculation:
If you analyze 10,000 tweets per day, and each tweet is approximately 77 tokens, the total daily input tokens are 10,000 * 77 = 770,000 tokens.
The cost for input usage is $0.0030 per 1,000 tokens. So, the daily input cost is 770,000 / 1,000 * $0.0030 = $2.31.

Output Token Estimation:
The output for each tweet verification might vary, but let's assume an average of around 50 tokens per response (this is a rough estimate, as the model might provide a brief or detailed analysis depending on the complexity of the tweet).
For 10,000 tweets, the total output tokens would be 10,000 * 50 = 500,000 tokens.

Output Cost Calculation:
The cost for output usage is $0.0060 per 1,000 tokens. So, the daily output cost is 500,000 / 1,000 * $0.0060 = $3.00.

Total Daily Cost:
Adding both input and output costs, the total daily cost would be $2.31 (input) + $3.00 (output) = $5.31.

Monthly Cost Estimation:
If this operation runs every day for a month (30 days), the monthly cost would be $5.31 * 30 = $159.30 (per month).
This is a rough estimate based on average values for token length and response size. The actual cost may vary depending on the exact length of each tweet and the verbosity of the model's responses.
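
The arithmetic above is easy to re-run; here is the same rough estimate as a few lines of Python, using the same assumed prices and token counts:

```python
# Re-running the rough estimate above; all values are the same assumptions as in the text.
tweets_per_day = 10_000
input_tokens_per_tweet = 77               # ~51 words * ~1.5 tokens/word
output_tokens_per_tweet = 50              # assumed average response length
input_price_per_token = 0.0030 / 1000     # assumed $ per input token
output_price_per_token = 0.0060 / 1000    # assumed $ per output token

daily_cost = tweets_per_day * (input_tokens_per_tweet * input_price_per_token
                               + output_tokens_per_tweet * output_price_per_token)
print(f"daily:   ${daily_cost:.2f}")       # -> $5.31
print(f"monthly: ${daily_cost * 30:.2f}")  # -> $159.30
```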

This seems like a reasonable cost to aim for. For big social media companies, this is nothing (even if it rises to $1000/month). I picked 10k tweets per day because we could use other, cheaper methods to filter down to about that many tweets per day. For example: only throw in tweets that have over 100 likes (or more), and rank how important it is to send the tweet to 3.5-turbo or 4 based on a score that takes into account a list of sensitive words and embedding scores. I'm sure there's additional stuff we could add here. Clip anything with a low score and cap at x number of tweets per day.
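
As a hedged sketch of what that pre-filter could look like (the like threshold, word list, weights, score cutoff, and daily cap are all made-up illustrative values):

```python
# Hypothetical pre-filter deciding which tweets are worth sending to an LLM.
# All thresholds, weights, and the daily cap are illustrative, not tuned values.
from dataclasses import dataclass

@dataclass
class Tweet:
    text: str
    likes: int
    embedding_score: float   # similarity to contentious topics, computed upstream

SENSITIVE_WORDS = {"vaccine", "election", "war"}   # illustrative only
MAX_TWEETS_PER_DAY = 10_000

def priority(tweet: Tweet) -> float:
    word_hits = sum(w in tweet.text.lower() for w in SENSITIVE_WORDS)
    return 0.6 * tweet.embedding_score + 0.4 * min(word_hits, 3) / 3

def select_for_llm(tweets: list[Tweet]) -> list[Tweet]:
    candidates = [t for t in tweets if t.likes >= 100]          # engagement gate
    candidates = [t for t in candidates if priority(t) > 0.2]   # clip low scores
    candidates.sort(key=priority, reverse=True)
    return candidates[:MAX_TWEETS_PER_DAY]                      # daily cap
```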

You could add gpt-4-vision for tweets that contain images or gpt-4 for the y number of most-liked tweets.

Also, it would be good to make it even easier for people to write great community notes. I'm sure GPT-4 + search and other models can help with this.

(Of course, companies could probably save even more money by running a fine-tuned model, and maybe fine-tuned embeddings, on their own GPUs.)

@JayThibs
Author

Ok, some additional things to try:

  • MuMiN: it has a dataset and leaderboard for identifying misinformation tweets (it even has a multimodal part), plus a tutorial on how to use it. We could train a model on this (a rough baseline sketch follows this list). The MuMiN dataset is a challenging misinformation benchmark for automatic misinformation detection models. The dataset is structured as a heterogeneous graph and features 21,565,018 tweets and 1,986,354 users, belonging to 26,048 Twitter threads, discussing 12,914 fact-checked claims from 115 fact-checking organisations in 41 different languages, spanning a decade.
  • Could probably find more relevant stuff here: Papers with Code - Misinformation
  • Misinfo Baselines and related paper
  • Repository for fake news detection datasets
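
For the baseline idea above, here is a rough sketch of a first text-only classifier, assuming claims have already been exported to a hypothetical claims.csv with text and label columns (plain scikit-learn, not tied to any one dataset's API):

```python
# Generic text-only misinformation baseline; assumes a hypothetical CSV export
# with "text" and "label" columns pulled from one of the datasets above.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("claims.csv")   # hypothetical export
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=0)

model = make_pipeline(
    TfidfVectorizer(max_features=50_000, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```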

@JayThibs
Author

JayThibs commented Nov 15, 2023

Alright, so I started working on a repo for this.

[Screenshot, 2023-11-15]

@luckybear97

luckybear97 commented Nov 15, 2023

I don't think an AI is needed with Community Notes. Priority notes are already implemented on the X/Twitter backend, and adding another layer of AI verification would slow it down dramatically.

Furthermore, LLMs are hardly free from biases: their data is selectively "handpicked" according to their "AI Safety" guidelines, which goes against what X/Twitter's goals are. In addition, using GPT (OpenAI) while Grok is part of X would simply not make sense.

@JayThibs
Author

@luckybear97

I don't think an AI is needed with Community Notes. Priority notes are already implemented on the X/Twitter backend, and adding another layer of AI verification would slow it down dramatically.

I'm not familiar with the codebase, so I don't really know how the algorithm works. My guess is that AI could improve the ranking if it's used in addition to it. I'd be surprised if the algorithm's efficiency couldn't be improved. If not, there's also the Community Notes assistant that could help note writers.

Furthermore, LLMs are hardly free from biases: their data is selectively "handpicked" according to their "AI Safety" guidelines, which goes against what X/Twitter's goals are.

You can use custom LLMs (like one of the fine-tuned open source models) or optimize your system prompt / instructions accordingly to fit your use-case. I don't think this is really an issue.

In addition, using GPT (OpenAI) while Grok is part of X would simply not make sense.

Grok has no API. They could use an internal Grok model if they like; I wouldn't care. But OpenAI/Anthropic have APIs, so I started with them.

But honestly, Grok's personality is not very well-suited to help with this IMO.

@armchairancap

I don't think an AI is needed with Community Notes.

Yes. It's a form of mission creep and it would add complexity and cost without adding any value, especially considering that AI is heavily regulated and each jurisdiction may have its own official or approved AI that's biased to promote the official lies.

@TheApproach

TheApproach commented Dec 9, 2023

IMO, it's ideal for Community Notes to remain a source of genuine human feedback, with generated feedback kept to a minimum.
At least until fully cognizant & sentient AI people show up in the community, I suppose.

These are the notes of the community, and also an especially valuable part of public discourse that many AIs will undoubtedly look at. I think getting AI involved here would lead to creating nonsense artifacts, similar to toying with control samples in an experiment: it makes the results less useful.

@xcsf6

xcsf6 commented Dec 13, 2023

I don't think an AI is needed with Community Notes. Priority notes are already implemented on the X/Twitter backend, and adding another layer of AI verification would slow it down dramatically.

Furthermore, LLMs are hardly free from biases: their data is selectively "handpicked" according to their "AI Safety" guidelines, which goes against what X/Twitter's goals are. In addition, using GPT (OpenAI) while Grok is part of X would simply not make sense.

Do you claim that RLHF or other alignment data built on their "AI Safety" guidelines to remove bias actually increases the bias, and that X/Twitter's goal is to reflect the "raw data's biases" in Community Notes?

@armchairancap

Do you claim that RLHF or other alignment data built on their "AI Safety" guidelines to remove bias actually increases the bias, and that X/Twitter's goal is to reflect the "raw data's biases" in Community Notes?

Yes, I do.

You completely ignore the self-evident fact that governments are often the biggest and most systematic spreaders of dis- and mis-information, and that the AI "laws" X would have to follow would be different in every jurisdiction, on top of "safe" AI models being wrong, i.e. rigged by the state.

In your view North Korea's state-approved AI should be able to apply their "AI Safety guidelines" to my Community Note exposing their lies.

In addition to being impractical and impossible to implement, it is preposterous to think that Community Notes should be subject to various types of state-mandated AI censorship.

Why even bother having humans in the loop?

@tactipus

lol

@xcsf6

xcsf6 commented Jan 21, 2024

@armchairancap @luckybear97

Please don't argue with a "straw man."

I never talked about governments in my previous comment.

You completely ignore the self-evident fact that governments are often the biggest and most systematic spreaders of dis- and mis-information, and that the AI "laws" X would have to follow would be different in every jurisdiction, on top of "safe" AI models being wrong, i.e. rigged by the state.

In your view North Korea's state-approved AI should be able to apply their "AI Safety guidelines" to my Community Note exposing their lies.

In addition to being impractical and impossible to implement, it is preposterous to think that Community Notes should be subject to various types of state-mandated AI censorship.

At the very least, the free speech clause of the First Amendment to the US Constitution protects corporations like X, Meta, Google, etc. against government law enforcement.

Thus, LLMs created by X in the U.S. would be interpreted as X's corporate speech.

Why even bother having humans in the loop?

Which subset of humans are in the loop?

Because X can select Community Notes contributors in an arbitrary and closed manner, X can induce "selection biases". However, this is still considered "X's free speech right to be biased."

If X's goal were to create truly bias-free contributors, it would just need to use identifiers like phone numbers or biometrics, but X uses "violation history in X's moderation rules".

Furthermore, LLMs are hardly free from biases: their data is selectively "handpicked" according to their "AI Safety" guidelines, which goes against what X/Twitter's goals are

I don't think the current contributor selection mechanism is free from the biases mentioned above.

@xcsf6

xcsf6 commented Jan 21, 2024

@armchairancap

BTW, if you are an anarcho-capitalist, you have NO free speech rights against Microsoft on this GitHub platform because of MS's absolute private ownership of the computing clusters.

@luckybear97

@armchairancap @luckybear97

Please don't argue with a "straw man."

I never talked about governments in my previous comment.

You completely ignore the self-evident fact that governments are often the biggest and most systematic spreaders of dis- and mis-information, and that the AI "laws" X would have to follow would be different in every jurisdiction, on top of "safe" AI models being wrong, i.e. rigged by the state.

In your view North Korea's state-approved AI should be able to apply their "AI Safety guidelines" to my Community Note exposing their lies.

In addition to being impractical and impossible to implement, it is preposterous to think that Community Notes should be subject to various types of state-mandated AI censorship.

At the very least, the free speech clause of the First Amendment to the US Constitution protects corporations like X, Meta, Google, etc. against government law enforcement.

Thus, LLMs created by X in the U.S. would be interpreted as X's corporate speech.

Why even bother having humans in the loop?

Which subset of humans are in the loop?

Because X can select Community Notes contributors in an arbitrary and closed manner, X can induce "selection biases". However, this is still considered "X's free speech right to be biased."

If X's goal were to create truly bias-free contributors, it would just need to use identifiers like phone numbers or biometrics, but X uses "violation history in X's moderation rules".

Furthermore, LLMs are hardly free from biases: their data is selectively "handpicked" according to their "AI Safety" guidelines, which goes against what X/Twitter's goals are

Perhaps you're confused. I never claimed "X" is free from bias; it's just that "free speech" is their goal, and using an LLM to selectively rate or process notes would destroy it.

I don't think the current contributor selection mechanism is free from the biases mentioned above.

I don't know what their criteria for becoming a contributor are, and I never claimed it's free from biases. They are a private, for-profit company, so they have every right to decide whatever rules they want, although I do agree it's best to be transparent about it.

@xcsf6

xcsf6 commented Jan 21, 2024

Perhaps you're confused. I never claimed "X" is free from bias; it's just that "free speech" is their goal, and using an LLM to selectively rate or process notes would destroy it.

I wasn't addressing what your claim is; rather, @armchairancap said "You completely ignore the self-evident fact..." about something I never said.
