2024
- SIGCOMM’24. ConfMask: Enabling Privacy-Preserving Configuration Sharing via Anonymization. Yuejie Wang, Qiutong Men, Yao Xiao, and 2 more authors. In Proceedings of the ACM SIGCOMM 2024 Conference, Sydney, NSW, Australia, 2024.
Real-world network configurations play a critical role in network management and research tasks. Although these configurations are valuable, data holders often hesitate to share them due to business and privacy concerns. Existing methods fail to conceal the implicit information that can be inferred from configurations, such as topology and routing paths. To address this, we present ConfMask, a novel framework designed to systematically anonymize network topology and routing paths in configurations. Our approach tackles key privacy, utility, and scalability challenges, which arise from the strong dependency between different datasets and complex routing protocols. Our anonymization algorithm scales to large networks and effectively mitigates de-anonymization risk. Moreover, it maintains essential network properties such as reachability, waypointing, and multi-path consistency, making it suitable for a wide range of downstream tasks. Compared to an existing data plane anonymization algorithm (i.e., NetHide), ConfMask reduces the specification differences between the original and the anonymized networks by 75%.
- DASFAA’24. LawLLM: Intelligent Legal System with Legal Reasoning and Verifiable Retrieval. Shengbin Yue, Shujun Liu, Yuxuan Zhou, and 9 more authors. In The 29th International Conference on Database Systems for Advanced Applications (DASFAA’24), 2024.
We propose LawLLM, an LLM-powered intelligent legal system featuring (1) Versatile Services: LawLLM provides a diverse range of services through its multi-task capabilities; (2) Legal Reasoning: it is fine-tuned on supervised instruction data curated with legal syllogism prompting, enabling LawLLM to develop stronger legal reasoning capabilities based on clear judicial logic; (3) Verifiable Retrieval: with verifiable labels, LawLLM can first distinguish relevant external knowledge, then incorporate it, and finally validate it, enhancing the quality and factuality of model output. A comprehensive legal benchmark, Law-Eval, is further constructed to evaluate intelligent legal systems along both objective and subjective dimensions. Experiments demonstrate the effectiveness of our system in serving various users across diverse legal scenarios.
- B.Sc. Thesis. Analyzing the Critical Behavior of Bernoulli Percolation in Z^3 by Simulating the Invasion Percolation Process. Yao Xiao. 2024.
The invasion percolation process is widely known to be related to the standard Bernoulli bond percolation model. It is known that the infinite cluster density satisfies P_∞(p_c) = 0 at the critical point for Bernoulli bond percolation in Z^2, but this conclusion has not yet been proven to extend to higher dimensions. We note that it is easy to show that P_∞(p_c) = 0 in Z^d is equivalent to G_{Z^d}(0,x) → 0 as |x| → ∞ for the invasion percolation process, where G_{Z^d}(0,x) is the probability that x is invaded by an invasion percolation process starting from the origin. In this paper, we show by simulation that G_{Z^3}(0,x) is dominated by G_{Z^2}(0,x) for the same |x|, based on which we show that P_∞(p_c) = 0 in Z^3. Finally, we numerically estimate the fractal dimension of the invasion percolation cluster in Z^3.
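For readers unfamiliar with the process, the following is a minimal sketch of invasion percolation on Z^d, not the thesis code. It assumes independent uniform bond weights sampled lazily the first time a bond reaches the boundary of the invaded cluster; the function name `invasion_percolation` and its parameters are hypothetical.

```python
import heapq
import random


def invasion_percolation(dim, n_sites, seed=0):
    """Grow an invasion percolation cluster on Z^dim from the origin.

    Returns the set of invaded sites once it contains `n_sites` sites.
    Bond weights are i.i.d. uniform on [0, 1) and are drawn lazily.
    """
    rng = random.Random(seed)
    origin = (0,) * dim
    invaded = {origin}
    boundary = []  # min-heap of (bond weight, site at the far end of the bond)

    def push_boundary_bonds(site):
        # A bond gets its weight when its first endpoint is invaded,
        # so each bond is sampled at most once.
        for axis in range(dim):
            for step in (-1, 1):
                nb = site[:axis] + (site[axis] + step,) + site[axis + 1:]
                if nb not in invaded:
                    heapq.heappush(boundary, (rng.random(), nb))

    push_boundary_bonds(origin)
    while len(invaded) < n_sites:
        _, site = heapq.heappop(boundary)  # lowest-weight boundary bond
        if site in invaded:
            continue  # this bond joins two invaded sites; no new site
        invaded.add(site)
        push_boundary_bonds(site)
    return invaded


if __name__ == "__main__":
    cluster = invasion_percolation(dim=3, n_sites=1000, seed=42)
    radius = max(max(abs(c) for c in site) for site in cluster)
    print(len(cluster), "sites invaded; max coordinate reached:", radius)
```

Repeating such runs with different seeds and recording how often a given site x is invaded gives an empirical, finite-time stand-in for G_{Z^d}(0,x); any finite run can only approximate the probability of eventual invasion.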
2023
- B.Sc. Capstone. Optimizing the Serving System for Large Language Model Inference. Yao Xiao. 2023.
Large Language Models (LLMs) such as ChatGPT have been progressively revolutionizing natural language processing tasks. Since most LLM applications, such as chatbots and question answering systems, involve real-time interactions, high throughput and low latency are critical for LLM inference systems. However, existing LLM inference systems struggle because batch sizes are limited by key-value cache (KV cache) memory and because the Attention mechanism introduces bubbles in the inference pipeline, both of which make inference inefficient. In this paper, I present FluidInfer to address these challenges. FluidInfer abandons the notion of batch size and instead arbitrarily concatenates or splits requests in non-Attention layers to achieve much higher throughput. It further packs requests dynamically in the Attention layer to reduce pipeline bubbles, while overcoming the KV cache limitation by swapping between GPU and host memory. Simulation results show that FluidInfer achieves 2x to 4x higher throughput than state-of-the-art LLM serving systems and an approximately 3x lower latency bound.
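To make the KV cache constraint concrete, here is a back-of-the-envelope sketch, assuming a standard multi-head attention transformer that stores one key and one value vector per token per layer. The model dimensions and memory budget below are hypothetical and are not taken from the capstone report.

```python
def kv_cache_bytes_per_token(n_layers, hidden_size, bytes_per_value=2):
    """KV cache footprint of one token.

    The factor 2 covers one key vector and one value vector per layer;
    bytes_per_value=2 corresponds to 16-bit floats.
    """
    return 2 * n_layers * hidden_size * bytes_per_value


def max_batch_size(free_bytes, seq_len, n_layers, hidden_size, bytes_per_value=2):
    """Largest number of requests of length `seq_len` whose KV cache fits."""
    per_request = seq_len * kv_cache_bytes_per_token(
        n_layers, hidden_size, bytes_per_value
    )
    return free_bytes // per_request


if __name__ == "__main__":
    # Hypothetical model: 32 layers, hidden size 4096, fp16 cache.
    per_token = kv_cache_bytes_per_token(n_layers=32, hidden_size=4096)
    print(per_token)  # 524288 bytes, i.e. 0.5 MiB per token

    # Hypothetical budget: 16 GiB of GPU memory left after loading weights.
    print(max_batch_size(16 * 2**30, seq_len=2048, n_layers=32, hidden_size=4096))  # 16
```

Under these assumed numbers each 2048-token request needs 1 GiB of cache, so only 16 requests fit at once regardless of available compute, which is the kind of limit that swapping between GPU and host memory aims to relax.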
- arXiv. Efficiently Visualizing Large Graphs. Xinyu Li, Yao Xiao, and Yuchen Zhou. 2023.
Most existing graph visualization methods based on dimension reduction are limited to relatively small graphs due to performance issues. In this work, we propose a novel dimension reduction method for graph visualization, called t-Distributed Stochastic Graph Neighbor Embedding (t-SGNE). t-SGNE is specifically designed to visualize cluster structures in the graph. As a variant of the standard t-SNE method, t-SGNE avoids the time-consuming computation of pairwise similarities. Instead, it uses the neighbor structures of the graph to reduce the time complexity from quadratic to linear, thus supporting larger graphs. In addition, to suit t-SGNE, we combine Laplacian Eigenmaps with the shortest path algorithm in graphs to form the graph embedding algorithm ShortestPath Laplacian Eigenmaps Embedding (SPLEE). By applying SPLEE to obtain a high-dimensional embedding of a large-scale graph and then using t-SGNE to reduce its dimension for visualization, we are able to visualize graphs with up to 300K nodes and 1M edges within 5 minutes and achieve approximately 10% improvement in visualization quality. Code and data are available here.
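The quadratic-to-linear claim can be illustrated with a small sketch. This is not the t-SGNE definition, which the abstract does not give; it only assumes that similarities are evaluated over graph edges instead of over all node pairs, and it borrows the Student-t kernel (1 + d^2)^-1 that standard t-SNE uses in the low-dimensional space. The function name `neighbor_similarities` and the toy data are hypothetical.

```python
def neighbor_similarities(edges, positions):
    """Student-t similarities restricted to graph edges, normalized to sum to 1.

    `edges` is a list of (i, j) node pairs and `positions` maps each node
    to its low-dimensional coordinates. The cost is proportional to the
    number of edges, whereas evaluating all node pairs would be quadratic
    in the number of nodes.
    """
    raw = {}
    for i, j in edges:
        d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
        raw[(i, j)] = 1.0 / (1.0 + d2)  # heavy-tailed kernel, as in t-SNE
    total = sum(raw.values())
    return {edge: weight / total for edge, weight in raw.items()}


if __name__ == "__main__":
    # Toy path graph 0 - 1 - 2 with hypothetical 2D positions.
    edges = [(0, 1), (1, 2)]
    positions = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 2.0)}
    for edge, q in neighbor_similarities(edges, positions).items():
        print(edge, round(q, 4))
```

For a graph with n nodes and m edges this touches m pairs instead of n(n-1)/2, which is what makes graphs with hundreds of thousands of nodes tractable when m grows roughly linearly with n.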