| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| Beyond one-preference-fits-all alignment: Multi-objective direct preference optimization | Z Zhou, J Liu, C Yang, J Shao, Y Liu, X Yue, W Ouyang, Y Qiao | arXiv preprint arXiv:2310.03708, 2023 | 10 | 2023 |
| Attacks, defenses and evaluations for LLM conversation safety: A survey | Z Dong, Z Zhou, C Yang, J Shao, Y Qiao | arXiv preprint arXiv:2402.09283, 2024 | 4 | 2024 |
| MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues | G Bai, J Liu, X Bu, Y He, J Liu, Z Zhou, Z Lin, W Su, T Ge, B Zheng, ... | arXiv preprint arXiv:2402.14762, 2024 | 2 | 2024 |
| Intent: Interactive tensor transformation synthesis | Z Zhou, MT Tang, Q Pan, S Tan, X Wang, T Zhang | Proceedings of the 35th Annual ACM Symposium on User Interface Software and …, 2022 | 1 | 2022 |
| ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models | Y Wu, J Liu, X Bu, J Liu, Z Zhou, Y Zhang, C Zhang, Z Bai, H Chen, T Ge, ... | arXiv preprint arXiv:2402.14660, 2024 | | 2024 |
| Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! | Z Zhou, J Liu, Z Dong, J Liu, C Yang, W Ouyang, Y Qiao | arXiv preprint arXiv:2402.12343, 2024 | | 2024 |