Luuu pfp

Luuu

@luuu

80 Following
8 Followers


Luuu
@luuu
#DailyChallenge Day 21 (Final) Ontology
- A structured framework that defines concepts, relationships, and categories within a domain.
- Serves as a comprehensive map from an organization's data assets (such as datasets and models) to real-world entities and processes.
- Helps AI understand context and meaning in data, improving reasoning and decision-making.
- Example: a medical ontology organizes diseases, symptoms, and treatments into a structured format AI can use for diagnosis.
0 reply
0 recast
2 reactions
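The medical example above can be sketched as a plain data structure. This is a minimal illustration, not a real ontology standard; the concept names and relation labels ("is_a", "has_symptom", "treated_by") are made up for the example.

```python
# Toy medical ontology: concepts mapped to their relations.
# All names and relations here are illustrative assumptions.
ontology = {
    "Influenza": {"is_a": "Disease",
                  "has_symptom": ["Fever", "Cough"],
                  "treated_by": ["Oseltamivir"]},
    "Fever": {"is_a": "Symptom"},
    "Cough": {"is_a": "Symptom"},
    "Oseltamivir": {"is_a": "Treatment"},
}

def diseases_with_symptom(onto, symptom):
    """Return every concept that lists the given symptom."""
    return [name for name, props in onto.items()
            if symptom in props.get("has_symptom", [])]

print(diseases_with_symptom(ontology, "Fever"))  # ['Influenza']
```

Even this tiny structure shows the point: once relationships are explicit, a program (or an AI system) can traverse them to answer questions it was never directly given.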

Luuu
@luuu
Base Seoul Feb meetup!
0 reply
0 recast
2 reactions

Luuu
@luuu
#dailychallenge day 20 AGI (Artificial General Intelligence)
- Definition: an AI system with human-like cognitive abilities, capable of understanding, learning, and applying knowledge across various tasks without task-specific programming.
- AGI remains a theoretical concept, so its precise definition varies from speaker to speaker; broadly, it refers to the next level beyond narrow AI.
- Current LLMs are not AGI.
- The next (theoretical) level beyond AGI is ASI (Artificial Super Intelligence).
- Definition of ASI: an AI that surpasses human intelligence in every aspect, including creativity, problem-solving, and emotional intelligence.
0 reply
0 recast
2 reactions

Luuu
@luuu
#dailychallenge day 19
PondAI
- A sleeper AI infra project.
- A crypto AI layer that runs incentive-driven competitions for onchain prediction models, powering smarter DeFAI, security, and trading agents.
- The best models receive incentives, and model developers retain ownership, provided the models are integrated with real-time data and inference infrastructure.
Gaianet
- A decentralized computing infrastructure that lets anyone create, deploy, scale, and monetize AI agents that reflect their own styles, values, knowledge, and expertise.
- Supports AI-powered dApps & smart contracts.
- Open-source framework.
0 reply
0 recast
4 reactions

Luuu
@luuu
#dailychallenge day 17 Web3 AI agents
*Virtuals Protocol
Breakthroughs:
• Tokenizes AI agents for monetization and governance.
• AI agents execute smart contracts and manage assets.
• No-code AI creation with modular interaction.
Limitations:
• Limited adoption and market maturity.
• Security risks with autonomous AI transactions.
• Ethical concerns over AI ownership.
---
*ai16z (ElizaOS)
Breakthroughs:
• AI-driven decentralized investment management.
• Optimizes financial decisions, reducing human bias.
• Merges AI and DeFi for new financial models.
Limitations:
• Legal and regulatory uncertainties.
• AI bias risks impacting investment strategies.
• High volatility and security concerns.
0 reply
0 recast
3 reactions

Luuu
@luuu
#dailychallenge day 16 Sakana AI (study related to Swarm)
- Approach: uses evolutionary algorithms instead of traditional transformer scaling.
- Model Merging: combines pre-trained models to inherit the best traits, rather than training from scratch.
- Natural-Selection-Inspired: optimizes AI models through adaptive selection, mimicking biological evolution.
- Efficiency Over Size: focuses on smaller, specialized models instead of massive, resource-heavy ones.
- Reduced GPU Dependency: prioritizes computational efficiency, making AI development more sustainable.
- Adaptive Intelligence: explores self-improving AI and agent-based systems for dynamic learning.
0 reply
0 recast
3 reactions
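The evolutionary merging idea can be shown in miniature. This is a hedged sketch, not Sakana AI's actual method: "models" are just weight vectors, children are convex combinations of two parents, and selection keeps the highest-scoring candidates. The fitness function and all parameters are invented for illustration.

```python
import random

def merge(a, b, alpha):
    """Child model = convex combination of two parent weight vectors."""
    return [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]

def fitness(w, target=(1.0, 0.0, -1.0)):
    """Toy fitness: negative squared distance to an arbitrary target."""
    return -sum((x - t) ** 2 for x, t in zip(w, target))

def evolve(parents, generations=20, seed=0):
    rng = random.Random(seed)
    pop = list(parents)
    for _ in range(generations):
        a, b = rng.sample(pop, 2)
        pop.append(merge(a, b, rng.random()))      # recombination
        pop = sorted(pop, key=fitness, reverse=True)[:4]  # selection
    return pop[0]

best = evolve([[2.0, 1.0, 0.0], [0.0, -1.0, -2.0]])
```

No gradient training happens here at all: search over combinations of existing weights replaces training from scratch, which is the core of the merging idea.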

Luuu
@luuu
#dailychallenge day 15 AGI Modularity Hypothesis
• The idea that Artificial General Intelligence (AGI) will be built from modular components, each specializing in different cognitive functions.
• Inspired by the human brain: just as the brain has specialized regions (e.g., vision, language, memory), AGI could have distinct modules for different tasks.
• Decentralized processing: instead of a single monolithic system, modular AGI would distribute intelligence across multiple specialized units.
• Improved efficiency: specialized modules could work together efficiently, reducing computational overhead compared to a single large model.
• Scalability: allows incremental improvement by upgrading or adding modules without retraining the entire system.
• Adaptability: enables dynamic learning, where AGI develops new skills by integrating new modules.
• Challenges: ensuring seamless communication between modules, preventing conflicts, and maintaining generalization across tasks.
0 reply
0 recast
3 reactions
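The hypothesis above is architectural, so a tiny routing sketch can make it concrete. This is purely illustrative: the module names and the keyword-based router are assumptions, standing in for whatever real dispatch mechanism a modular system would use.

```python
# Specialized modules, one per cognitive function (illustrative).
modules = {
    "vision": lambda task: f"vision module handled: {task}",
    "language": lambda task: f"language module handled: {task}",
    "memory": lambda task: f"memory module handled: {task}",
}

def route(task):
    """Naive keyword router: send the task to the first matching module."""
    for name, module in modules.items():
        if name in task:
            return module(task)
    # The generalization challenge: tasks no module covers.
    return "no module available"

print(route("describe vision input"))
```

Note how the fallback branch is exactly the "maintaining generalization" challenge from the last bullet: a modular system needs a story for tasks that fall between its modules.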

Luuu
@luuu
#dailychallenge Day 14 Swarm
• Definition: a system where multiple agents (robots, computers, or AI models) work together like a swarm of insects.
• Decentralized control: no single leader; each agent makes independent decisions based on local information.
• Inspired by nature: mimics swarm behavior in ants, bees, or birds for efficiency and adaptability.
• Self-organizing: agents dynamically adjust their roles and positions without direct supervision.
• Applications: robotics (swarm drones), AI (multi-agent systems), and blockchain (Swarm computing networks).
• Advantages: high scalability, fault tolerance, and adaptability to changing environments.
• Challenges: complex coordination, and potential for system-wide failure if communication is disrupted.
0 reply
0 recast
2 reactions
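Decentralized, local-information control can be shown in one dimension. In this minimal sketch (neighbour radius and step rate are arbitrary choices), each agent moves toward the average position of nearby agents only; no agent sees the whole swarm, yet clusters self-organize.

```python
def step(positions, radius=2.0, rate=0.5):
    """One swarm update: each agent nudges toward its local neighbourhood mean."""
    new = []
    for p in positions:
        # Local information only: agents within `radius` (including self).
        neighbours = [q for q in positions if abs(q - p) <= radius]
        local_mean = sum(neighbours) / len(neighbours)
        new.append(p + rate * (local_mean - p))
    return new

# Two groups of agents; no leader, no global coordinator.
swarm = [0.0, 1.0, 2.0, 10.0, 11.0]
for _ in range(10):
    swarm = step(swarm)
```

After a few steps each group converges on its own center while the groups stay apart, since no agent can "see" across the gap: coordination emerges from purely local rules.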

Luuu
@luuu
#dailychallenge Day 10 Transformer and “Attention Is All You Need”
- Probably the most well-known paper in AI.
- So, what is a transformer?
  - A deep learning model introduced by Vaswani et al. in 2017.
  - Uses self-attention and positional encoding to process sequences efficiently.
  - Replaces recurrent models (RNNs, LSTMs) for NLP tasks, enabling parallel processing.
  - Forms the backbone of modern LLMs like GPT and BERT.
- Key ideas from the paper “Attention Is All You Need”:
  - Self-attention: the model looks at all words simultaneously and figures out how they relate.
  - Multi-head attention: the model looks at words in different ways at once (e.g., meaning, position, importance).
  - Positional encoding: since words are processed all at once, a small trick helps the model remember word order.
  - Faster and more accurate: unlike older models (like RNNs), transformers don’t have to step through past words one at a time to make sense of the sentence.
0 reply
0 recast
3 reactions
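Self-attention is concrete enough to sketch in a few lines. This toy version uses three 2-d "tokens" as queries, keys, and values at once (real transformers apply learned projections first); the values are invented for illustration.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention over lists of vectors."""
    d = len(K[0])
    out = []
    for q in Q:
        # Every query scores against every key — all tokens at once.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Output = weighted mix of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy token vectors
ctx = attention(X, X, X)                   # self-attention: Q = K = V = X
```

Because every query scores against every key in one pass, nothing waits on "previous words" — which is exactly the parallelism the post contrasts with RNNs.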

Luuu
@luuu
I said yes to this round
0 reply
0 recast
0 reaction

Luuu
@luuu
#dailychallenge How LLMs understand humans (NLP), and how to make them understand better - for non-engineers.
- LLMs are trained on datasets and learn statistical patterns of text.
- Current LLMs apply self-attention to determine which parts of the input text are most relevant to each token.
- Since they don’t understand but predict what comes next, knowing what dataset they were trained on helps you communicate with them better.
  - You can check the model’s papers to see what kind of dataset was used for training.
  - However, for non-AI engineers, this might not be feasible.
- The most common practical approach is to use markdown format.
  - Since LLMs were trained on many developer documents, they handle this format well.
  - Markdown is also simple and patterned, so it’s easier for an LLM to parse.
- Make sure your prompts are structured.
- You can also implement CoT yourself.
To sum up: use markdown and structure your text.
1 reply
0 recast
3 reactions
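Here is the advice above in practice: a prompt structured as markdown with headings and bullets. The section names (Role, Context, Task) are one common convention, not a required schema; any LLM client could consume this string.

```python
# A structured markdown prompt, assembled as a plain Python string.
# Section names are illustrative — the point is the consistent structure.
prompt = """\
# Role
You are a concise technical assistant.

# Context
- Audience: non-engineers
- Output length: under 100 words

# Task
1. Explain what self-attention does.
2. Give one everyday analogy.
"""
print(prompt)
```

Compare this with dumping the same requirements into one unstructured sentence: the headings and numbered steps mirror the patterned developer docs the model was trained on.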

Luuu
@luuu
#dailychallenge MoE - Mixture of Experts
- A neural network architecture: split the LLM into multiple small experts, organize those experts into a network, and activate only a specific subset of experts based on the input.
- This improves efficiency by reducing computational cost while still maintaining high performance. To sum up, it can:
  - make inference cheaper
  - and make performance higher
- However:
  - if gating is inefficient, it won’t perform better
  - when specific experts are overused (load balancing issue), training becomes imbalanced (mode collapse issue)
  - in a distributed computing environment, MoE increases inter-device communication, resulting in performance slow-down
  - it is harder to train than a standard transformer model
- Due to these features, CoT goes well with MoE.
0 reply
0 recast
1 reaction
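The "activate only a subset" mechanism is a top-k gate, sketched below under simplifying assumptions: experts are plain functions instead of networks, and the gate weights are fixed made-up numbers rather than learned ones.

```python
import math

# Three toy "experts" and a hand-made gate (one weight row per expert).
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2]
gate_weights = [[0.1, 0.9], [1.0, 0.0], [0.5, 0.5]]

def moe(features, value, k=2):
    """Score all experts, but run only the top-k — the source of the savings."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in gate_weights]
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    exps = [math.exp(scores[i]) for i in top]
    z = sum(exps)
    # Output = softmax-weighted sum over the activated experts only.
    return sum((e / z) * experts[i](value) for e, i in zip(exps, top))

y = moe([1.0, 0.0], 3.0)  # experts 1 and 2 win the gate; expert 0 never runs
```

Expert 0 is never evaluated for this input, which is the cost saving; and if the gate kept picking the same experts for every input, you would see exactly the load-balancing problem the post describes.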

Luuu
@luuu
#dailychallenge CoT
- Chain-of-Thought (CoT) is a reasoning technique where a model breaks down complex problems into intermediate steps before arriving at a final answer.
- Put simply: thinking step by step, not deriving answers directly.
- This enhances reasoning capabilities in large language models (LLMs), especially for multi-step problems.
- OpenAI’s o1 and o3 use this, and DeepSeek R1 also uses this methodology.
- This makes the model:
  - improve logical reasoning and problem-solving skills
  - better at mathematical, coding, and reasoning-based tasks
  - more interpretable, by showing intermediate reasoning steps
- The recent trend is not just to train, but to reason: a framework where the model decomposes complex tasks into a logical sequence of intermediate steps before reaching a conclusion, enhancing its ability to solve problems that require multi-step thinking rather than direct retrieval.
0 reply
0 recast
4 reactions
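The contrast between direct answering and CoT is visible at the prompt level. This sketch just builds the two prompt strings; the wording is one illustrative phrasing, and either string could be sent to any LLM API.

```python
question = "A shop sells pens at 3 for $2. How much do 12 pens cost?"

# Direct prompting: ask for the answer only.
direct_prompt = f"{question}\nAnswer with a number only."

# Chain-of-thought prompting: require intermediate steps first.
cot_prompt = (
    f"{question}\n"
    "Think step by step:\n"
    "1. Restate what is given.\n"
    "2. Work out each intermediate quantity.\n"
    "3. Only then state the final answer."
)
```

The CoT version forces the intermediate quantities (12 pens = 4 groups of 3, each group $2) into the output, which is both where the accuracy gain and the interpretability come from.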

Luuu
@luuu
#dailychallenge LLM Fine-tuning Methodology
- LoRA Tuning, P-Tuning, Instruction Tuning
- A pre-trained model gets fine-tuned with task-specific data.
  - LoRA is best for parameter-efficient fine-tuning.
  - P-Tuning optimizes the prompt side by learning prompt embeddings.
  - Instruction Tuning focuses on making models better at understanding and executing human instructions.
- LoRA Tuning & P-Tuning are both PEFT (Parameter-Efficient Fine-Tuning) methods.
  - This means they don’t change the whole parameter set but train small additional parameters.
- Instruction Tuning is used to enhance the model itself.
  - It uses a dataset of instruction-and-answer pairs.
- While PEFT is a methodology to adjust the model to fit specific tasks, instruction tuning is used for performance enhancement.
- In short: PEFT is tuning; Instruction Tuning is training.
0 reply
0 recast
3 reactions
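The LoRA idea reduces to one equation: keep the pre-trained weight matrix W frozen and learn only a low-rank update B·A. The sketch below shows it with tiny hand-picked matrices (all values and the rank are illustrative); real implementations do this per layer inside the network.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def add(A, B):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(A, B)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen pre-trained weights (2x2)
B = [[0.5], [0.0]]              # trainable low-rank factor, rank r = 1 (2x1)
A = [[0.0, 1.0]]                # trainable low-rank factor (1x2)

# Effective weights at inference time: W + B @ A.
W_adapted = add(W, matmul(B, A))
```

The 2x2 case is too small to show the savings, but the scaling argument is direct: for an n x n matrix, full fine-tuning touches n*n parameters while the two rank-r factors hold only 2*n*r, which is why LoRA counts as parameter-efficient.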

Luuu
@luuu
#dailychallenge Research on DeepSeek R1, part 2
* The cost-saving information related to training was already announced in the V3 paper released last Christmas, not with the R1 model.
* While MoE (Mixture of Experts) was implemented starting from V2, meaningful results only began to appear with V3.
* DeepSeekMoE, referring to Mixture of Experts, activates only the experts relevant to a specific topic. In contrast, models like GPT-3.5 activate the entire network during both training and inference, regardless of the input token.
* The total cost of $5,576,000 covers only the final training run. Expenses for model structure design, algorithm development, data preparation, preliminary research, and comparative experiments were excluded.
* It is presumed that DeepSeek distilled GPT-4o and Claude Sonnet to generate training tokens.
0 reply
0 recast
3 reactions

Luuu
@luuu
#dailychallenge DeepSeek R1
- An open-source LLM trained with reinforcement learning (only), plus MoE and CoT.
- Trained with AI scoring AI, without human interference.
- Enables a low-cost, high-performance LLM.
0 reply
0 recast
4 reactions