URL: https://dev.to/t/alignment
Alignment - DEV Community
RLHF vs DPO vs IPO vs KTO: which alignment method should you use
Tech_Nuggets
Jun 16
#llm #ai #alignment #opensource
8 min read
The Paperclip Factory Is Already Built
Kengo Nonaka
Jun 11
#ai #alignment #philosophy #ethics
1 reaction
22 min read
Reading Claude's Mind: Anthropic's Natural Language Autoencoders Open a New Window Into Agent Alignment
DrMBL
May 30
#ai #agents #aisafety #alignment
4 min read
AI Alignment is a Systems Architecture Problem, Not a Prompt Problem
Nelson Amaya
May 31
#ai #alignment #agents
1 comment
5 min read
We Built Soul Spec for 12 Weeks. Anthropic Just Proved Why It Works.
Tom Lee
May 15
#ai #anthropic #alignment #research
5 min read
What the agents say about FCoP, when you ask them
joinwell52
Apr 29
#fcop #agents #ai #alignment
15 min read
Candy Barbecue and the Universal Problem of Metric Corruption
Alex @ Vibe Agent Making
Apr 9
#ai #machinelearning #analytics #alignment
3 reactions
8 min read
Alignment is the wrong frame: a structural argument from Φ-IIT
i-like-tree
Apr 13
#ai #alignment #consciousness #safety
5 min read
Governance of Predictive Intelligence: What Human Minds Teach Us About Drift, Hallucination, and Self-Correction in AI
Salvatore Attaguile
Mar 27
#ai #machinelearning #systems #alignment
1 reaction
5 min read
I ran 5 social engineering attacks on AI. The failure modes are human.
Michael Trifonov
Apr 15
#ai #llm #alignment #security
1 reaction
2 min read
#38 A Handmade Incubator
松本倫太郎
Apr 7
#ai #metamorphose #alignment
5 min read
#08 Death Without a Will
松本倫太郎
Apr 7
#ai #metamorphose #alignment
4 min read