Published April 28, 2025 | Version v1
Model Open

Domain-Specific Classifier for Predicting Webpage Relevance

Creators

  • 1. TU Wien

Description

Fine-tuned base_bge-large-en-v1.5 for website topic relevance

This dataset contains a fine-tuned version of the base_bge-large-en-v1.5 model, tuned to produce embeddings that capture the topical meaning of a website.

Context and methodology

  • This model was fine-tuned to demonstrate data stewardship.
  • It was created by fine-tuning the existing base_bge-large-en-v1.5 model on training samples taken from websites.

Technical details

This dataset contains the tensor weights and tokenizer files needed to run the model with the SentenceTransformers toolkit.

To use this model, follow the instructions for the base model, BAAI/bge-large-en-v1.5, on Hugging Face.
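As a rough illustration of what following the base model's instructions might look like, the sketch below loads the files from a local folder with SentenceTransformers and scores pages against a topic by cosine similarity. The folder name `./finetuned-bge-large-en-v1.5`, the helper names, and the choice of cosine similarity as the relevance score are assumptions made for this example and are not stated in the record. It also assumes the downloaded files sit together in one directory in the layout SentenceTransformers expects.

```python
import math
from typing import List, Sequence

# Hypothetical local folder holding the files downloaded from this record.
MODEL_DIR = "./finetuned-bge-large-en-v1.5"


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return float(dot / (norm_a * norm_b))


def relevance_scores(topic: str, page_texts: List[str],
                     model_dir: str = MODEL_DIR) -> List[float]:
    """Embed a topic description and page texts, then score each page against the topic.

    Assumption: cosine similarity between embeddings is used as the relevance
    score; the record itself does not prescribe a scoring method.
    """
    # Imported here so the helper above stays usable without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_dir)
    embeddings = model.encode([topic, *page_texts], normalize_embeddings=True)
    topic_vec, page_vecs = embeddings[0], embeddings[1:]
    return [cosine_similarity(topic_vec, vec) for vec in page_vecs]
```

Calling `relevance_scores("renewable energy policy", [page_a_text, page_b_text])` would return one score per page, with higher values suggesting a closer topical match. The threshold that separates relevant from irrelevant pages is not given in the record and would need to be chosen empirically.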

Files

config.json

Files (640.2 MiB)

Checksum                                  Size
md5:26108074f7cb2f0d7d05f9d87af9513f      697 Bytes
md5:2a0680239cce1c57a260407a3592eba7      639.3 MiB
md5:0d21dc8bbd9451f28b3609ed3dcb2375      695 Bytes
md5:8d46e7d5827bb296905d199ff3d2042d      694.8 KiB
md5:dd226a6c75dd74ecac9c5e0ac5fd11d6      1.2 KiB
md5:306687e0ce41f5c83a4222a84708b026      7.4 KiB
md5:64800d5d8528ce344256daf115d4965e      226.1 KiB
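The MD5 checksums in the file listing can be used to confirm that a download arrived intact. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `matches_listing` are made up for this example, and which checksum belongs to which file has to be read from the record page itself.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed_checksum: str) -> bool:
    """Compare a file against a checksum written as in the listing, e.g. 'md5:26108074...'."""
    expected = listed_checksum.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Chunked reading matters mainly for the 639.3 MiB weights file, which would otherwise be loaded into memory in one piece.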