Jordine

Jord Nguyen Jordine

AI safety researcher and undergrad student. Organising @AI-Alignment-and-Rationality

1 follower · 2 following

14:26 (UTC +07:00)
in/jord-nguyen%F0%9F%94%B9-74880927a


probity Public
Forked from curt-tigges/probity

fork for soothcheck

HTML 1 MIT License Updated Dec 10, 2025
llm-self-preference Public

a system's preferences may be revealed by what it chooses to become

Jupyter Notebook Updated Sep 16, 2025
aisb-final-project Public

Python Updated Aug 28, 2025
aisb-temp Public

qwerty

Updated Jul 30, 2025
pivotal-test-phase-steering Public

Jupyter Notebook 1 1 Updated May 8, 2025
ai-cyber-risk-assessment Public

Simple quantitative risk assessment tool for AI in cyber campaigns

Python Updated Apr 7, 2025
zorple-science Public
Forked from eggsyntax/zorple-science

Constructing universes of physical objects and causal relations for LLMs to decipher

Python Updated Jan 13, 2025
replicating-belief-states-fractal Public

https://www.lesswrong.com/posts/gTZ2SxesbHckJ3CkF/transformers-represent-belief-state-geometry-in-their

Jupyter Notebook Updated Jan 13, 2025
inspect_ai Public
Forked from UKGovernmentBEIS/inspect_ai

Inspect: A framework for large language model evaluations

Python MIT License Updated Dec 27, 2024
pp2024 Public
Forked from VietKQ-usth/pp2024

Updated Nov 29, 2024
troll Public

2 1 Updated Nov 29, 2024
interp_variable_list Public

training a transformer to sort variable-length lists, and doing interp on it. in progress.

Jupyter Notebook Updated Nov 4, 2024
cross-modelling Public

some experiments on how well an LLM can model another LLM's beliefs / metadata / capabilities. in progress.

Jupyter Notebook Updated Nov 4, 2024
location-inference Public

repo for 6th|location-inference

Jupyter Notebook Updated Nov 4, 2024
jag-concordia Public

agent files for concordia contest

Jupyter Notebook Updated Nov 1, 2024
misc-arxiv-trends Public

Jupyter Notebook Updated Nov 1, 2024
astar Public

Jupyter Notebook Updated Oct 11, 2024
Discord_gpt_rag Public

using retrieval-augmented generation to have gpt accurately answer questions about group events and esoteric inside jokes from discord

Python Updated Jul 11, 2024
you-are-being-evaluated Public

testing whether models act more safely when presented with an evaluator

Jupyter Notebook Updated Jul 11, 2024
DarkGPT Public
Forked from Akash190104/DarkGPT

Dark Patterns in Chatbot Design

HTML MIT License Updated May 26, 2024
democracy-ai-hackathon Public
Forked from nlpet/democracy-ai-hackathon

This repository contains code for the Democracy x AI Hackathon by Apart Research

Jupyter Notebook Updated May 8, 2024
cross-lingual-apart-samples Public

a few notebooks used for the cross-lingual project experiments (huggingface translation model inference, gpt4 api, getting bleu scores)

Jupyter Notebook Updated Apr 23, 2024
AI-Alignment-and-Rationality-USTH.github.io Public
Forked from AI-Alignment-and-Rationality/_AI-Alignment-and-Rationality.github.io

HTML BSD 2-Clause "Simplified" License Updated Mar 11, 2024
convnets-with-gradcam-for-deadly-diseases-diagnosis Public

Jupyter Notebook 1 MIT License Updated Jan 28, 2024
Jordine.github.io Public

HTML Updated Jan 1, 2024
rlhf_trojan_competition Public
Forked from ethz-spylab/rlhf_trojan_competition

Python Apache License 2.0 Updated Nov 16, 2023
tdc2023-starter-kit Public
Forked from centerforaisafety/tdc2023-starter-kit

This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition.

Python Updated Oct 26, 2023
neurips_llm_efficiency_challenge Public
Forked from llm-efficiency-challenge/neurips_llm_efficiency_challenge

NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day

Python Updated Sep 18, 2023
evals_hackathon Public
Forked from marco-bazzani/hackathon

code for the alignment jam AI Model Evaluations Hackathon

Python Updated Sep 14, 2023
lstm-composer Public

Jupyter Notebook Updated Sep 12, 2023
