AI & ML interests

Network-based Credibility Modelling



CrediNet


CrediNet is a set of tools that applies graph machine learning and computational methods to credibility modelling on the web. We build billion-scale webgraphs and use them to assess the credibility of websites; these scores can be used downstream to improve the robustness of Retrieval-Augmented Generation and to support fact-checking. This involves large-scale web scraping and text processing, as well as developing model architectures that interpret the different types of signals found on the web, including structural, temporal, and linguistic cues.
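To make the idea of a structural signal concrete, here is a minimal, purely illustrative sketch: a toy PageRank computation over a tiny hypothetical domain graph. The graph, domain names, and damping factor are made up for illustration; CrediNet's actual models combine many more signals (temporal, linguistic) at billion-node scale.

```python
# Illustrative only: a toy structural signal (PageRank) on a tiny domain graph.
# The hyperlink graph below is invented; it is NOT CrediNet's method or data.

def pagerank(graph, damping=0.85, iters=50):
    """Compute PageRank scores for a {node: [outlinks]} adjacency dict."""
    nodes = list(graph)
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iters):
        new = {node: (1 - damping) / n for node in nodes}
        for node, outlinks in graph.items():
            if outlinks:
                share = damping * scores[node] / len(outlinks)
                for target in outlinks:
                    new[target] += share
            else:  # dangling node: redistribute its mass uniformly
                for target in nodes:
                    new[target] += damping * scores[node] / n
        scores = new
    return scores

# Hypothetical mini webgraph: edges are hyperlinks between domains.
web = {
    "trusted-news.example": ["encyclopedia.example"],
    "encyclopedia.example": ["trusted-news.example"],
    "spam-farm.example": ["trusted-news.example"],
}
ranks = pagerank(web)
# Well-linked domains accumulate more authority than the unreferenced spam farm.
assert ranks["trusted-news.example"] > ranks["spam-farm.example"]
```

Structural cues like this capture how the web "votes" for a domain via links; temporal and linguistic cues are then layered on top by the learned models.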

See also: our GitHub codebases

License: CC BY 4.0


Projects

  • CrediBench: a benchmark of billion-scale temporal webgraphs at monthly granularity, sourced from Common Crawl. For the corresponding graph-construction pipeline, see CrediGraph - GitHub.
  • CrediPred: credibility scores inferred by our model (for details on the model architecture, see CrediPred - GitHub).
  • DomainRel: a dataset of 600k+ domains labelled as reliable or not (0-1), spanning four domains: phishing, malware, misinformation, and general knowledge.
  • CrediText: text embeddings extracted from scraped web content. The corresponding scraping and embedding pipelines are on CrediText - GitHub.
  • CrediNet: an API set up to query CrediPred scores easily on the client side (for details on the API setup and example usages, see CrediNet - GitHub).
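As a rough sketch of what a client-side score query could look like: the endpoint URL, parameter names, and response schema below are assumptions for illustration, not the documented API (refer to the CrediNet GitHub repository for the real setup).

```python
# Hypothetical sketch of a client-side CrediNet query. The endpoint URL,
# query parameter, and JSON schema are placeholders, not the actual API.
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.credinet.example/v1/score"  # placeholder endpoint

def build_score_request(domain: str) -> Request:
    """Build a GET request asking for the credibility score of one domain."""
    return Request(f"{BASE_URL}?{urlencode({'domain': domain})}")

def parse_score_response(body: str) -> float:
    """Extract the credibility score from a (hypothetical) JSON response."""
    return float(json.loads(body)["score"])

req = build_score_request("example.org")
# A real call would be: urllib.request.urlopen(req).read().decode()
mocked_body = '{"domain": "example.org", "score": 0.87}'
print(parse_score_response(mocked_body))  # → 0.87
```

Separating request construction from response parsing keeps the sketch testable without network access; a real client would wrap the two around an actual HTTP call.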
