Extract plain text from webpages
Description
Web.txt is a Chrome extension that converts webpages into plain text (.txt) files, so you can extract, download, and manage text content for purposes such as ChatGPT analysis, documentation, and research.
Features
• Minimum Word Threshold (n-grams): Extract only blocks of text that contain at least a chosen number of words, so that shorter, less relevant fragments are excluded.
• Download Plain Text (.txt): Save the extracted text content directly to your device.
• Open Plain Text in New Tab: View the extracted text content in a new browser tab for easy access and viewing.
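To illustrate the idea behind the minimum word threshold, here is a minimal Python sketch. It is not Web.txt's actual code: the function name filter_blocks, the min_words parameter, and the sample blocks are all hypothetical, and it assumes the page has already been split into text blocks.

```python
def filter_blocks(blocks, min_words):
    """Keep only blocks with at least min_words whitespace-separated words.

    Hypothetical helper for illustration; not taken from the extension.
    """
    kept = []
    for block in blocks:
        # Collapse runs of whitespace so the word count is not inflated.
        text = " ".join(block.split())
        if len(text.split()) >= min_words:
            kept.append(text)
    return kept


# Made-up sample blocks: a navigation bar, a real paragraph, and a button label.
blocks = [
    "Home | About | Contact",
    "Web.txt converts webpages into plain text files that are easy to download and reuse.",
    "Share",
]

# With a threshold of 8 words, only the paragraph survives.
print("\n\n".join(filter_blocks(blocks, min_words=8)))
```

A higher threshold removes more menus, captions, and button labels, at the cost of also dropping short but meaningful lines such as headings.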
Example Use Case: Analyzing large webpages
Text pasted directly into a GPT-4o prompt is limited to about 4,096 characters, which is around 800 to 1,000 words. When you upload a plain text (.txt) file instead, GPT-4o can batch process it, handling up to 50,000 characters (approximately 10,000 words) in a single batch. If the file exceeds this 50,000-character limit, GPT-4o automatically divides it into smaller chunks, each within the limit. This lets you use Web.txt to analyze webpages of 100,000+ words with GPT-4o.
1. Open any online article, forum, blog, wiki, etc. (URL must start with “https://”).
2. Open the Web.txt extension, select the minimum word threshold using the slider, and download the plain text (.txt) file.
3. Upload the file to GPT-4o and enter your prompt. GPT-4o will be able to read and reference the entire plain text (.txt) file.
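The chunking described in this use case can be pictured with a short Python sketch. This is only an assumption about how a long text might be divided, not a description of how GPT-4o works internally. The function name chunk_text is hypothetical, and the 50,000-character default simply reuses the figure quoted above.

```python
def chunk_text(text, max_chars=50_000):
    """Split text into pieces of at most max_chars characters.

    Hypothetical helper for illustration. It prefers to cut at a space so
    words are not split, and joining the chunks reproduces the input.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Look for the last space inside the window to avoid cutting a word.
            cut = text.rfind(" ", start, end)
            if cut > start:
                end = cut
        chunks.append(text[start:end])
        start = end
    return chunks


# Made-up sample: 150,000 characters of repeated filler text.
sample = "word " * 30_000
pieces = chunk_text(sample)
print(len(pieces), "chunks; longest has", max(len(p) for p in pieces), "characters")
```

Using the ratio quoted above (50,000 characters for roughly 10,000 words), a 100,000-word page is about 500,000 characters, or around ten chunks of this size.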