FunctionGemma Music Assistant (2-Function)

A fine-tuned version of google/functiongemma-270m-it for music playback control. The model supports 2 core music functions and selected the correct function in all 5 of its test cases.

Model Description

This is a LoRA-adapted FunctionGemma model trained specifically for music function calling. The model generates function calls in the FunctionGemma format for controlling music playback.

Training Details:

  • Base Model: google/functiongemma-270m-it (270M parameters)
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Parameters Trained: 3.8M (1.40% of total)
  • Training Examples: 60 (30 per function)
  • Training Time: ~1 minute
  • Accuracy: 100% (5/5 test cases)
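
The reported figures are consistent with each other: 3.8M is roughly 1.4% of 270M. The source does not state the LoRA rank or which layers were adapted, so the sketch below only illustrates the general formula for how many parameters one LoRA adapter adds to a single weight matrix. The layer shape and rank are hypothetical values chosen for illustration, not the settings used for this model.

```python
def lora_param_count(d_in, d_out, rank):
    """Parameters added by one LoRA adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


# Hypothetical layer shape and rank, chosen only to illustrate the formula.
added = lora_param_count(d_in=1024, d_out=1024, rank=16)
frozen = 1024 * 1024  # size of the frozen base weight matrix
print(f"{added} trainable vs {frozen} frozen parameters in this layer")
```

Summing this count over every adapted layer gives the total trainable parameter count that tools such as PEFT report.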

Supported Functions

1. play_song

Play a specific song by name, optionally narrowed down by artist or album.

Parameters:

  • song_name (required): Name of the song to play
  • artist (optional): Artist name
  • album (optional): Album name

Examples:

  • "Play Bohemian Rhapsody"
  • "Play Imagine by John Lennon"
  • "I want to hear Wonderwall"

2. playback_control

Control music playback (for example pause, resume, or skip).

Parameters:

  • action (required): One of: play, pause, skip, next, previous, stop, resume

Examples:

  • "Pause"
  • "Resume"
  • "Skip to next song"
  • "Stop the music"
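
A small model can still emit a call with a missing argument or an action outside the allowed list, so it may be worth checking each call before acting on it. The following is a minimal sketch of such a check, built only from the parameter lists above. The names validate_call, REQUIRED_ARGS, OPTIONAL_ARGS, and ALLOWED_ACTIONS are hypothetical helpers and not part of this model or any library.

```python
# Argument names and allowed actions copied from the parameter lists above.
REQUIRED_ARGS = {"play_song": {"song_name"}, "playback_control": {"action"}}
OPTIONAL_ARGS = {"play_song": {"artist", "album"}, "playback_control": set()}
ALLOWED_ACTIONS = {"play", "pause", "skip", "next", "previous", "stop", "resume"}


def validate_call(name, args):
    """Return a list of problems with a call; an empty list means it is valid."""
    if name not in REQUIRED_ARGS:
        return [f"unknown function: {name}"]
    given = set(args)
    missing = REQUIRED_ARGS[name] - given
    unexpected = given - REQUIRED_ARGS[name] - OPTIONAL_ARGS[name]
    problems = [f"missing argument: {key}" for key in sorted(missing)]
    problems += [f"unexpected argument: {key}" for key in sorted(unexpected)]
    if name == "playback_control" and "action" in args:
        if args["action"] not in ALLOWED_ACTIONS:
            problems.append(f"invalid action: {args['action']}")
    return problems


print(validate_call("play_song", {"song_name": "Imagine", "artist": "John Lennon"}))
print(validate_call("playback_control", {"action": "rewind"}))
```

Returning a list of problems rather than raising on the first one makes it easier to log everything that was wrong with a rejected call.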

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    torch_dtype=torch.float16,
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained("google/functiongemma-270m-it")

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "your-username/music-2func")

# Define functions
FUNCTIONS = [
    {
        "type": "function",
        "function": {
            "name": "play_song",
            "description": "Play a specific song by name or artist",
            "parameters": {
                "type": "object",
                "properties": {
                    "song_name": {"type": "string", "description": "Name of the song to play"},
                    "artist": {"type": "string", "description": "Artist name (optional)"},
                    "album": {"type": "string", "description": "Album name (optional)"}
                },
                "required": ["song_name"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "playback_control",
            "description": "Control music playback",
            "parameters": {
                "type": "object",
                "properties": {
                    "action": {
                        "type": "string",
                        "enum": ["play", "pause", "skip", "next", "previous", "stop", "resume"]
                    }
                },
                "required": ["action"]
            }
        }
    }
]

# Generate function call
user_input = "Play Bohemian Rhapsody"

messages = [{"role": "user", "content": user_input}]

prompt = tokenizer.apply_chat_template(
    messages,
    tools=FUNCTIONS,
    add_generation_prompt=True,
    tokenize=False
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        temperature=0.1,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False)
print(response)

Expected Output:

<start_function_call>call:play_song{song_name:<escape>Bohemian Rhapsody<escape>}<end_function_call>
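
The output is a plain string, so the calling application has to extract the function name and arguments itself. Below is a minimal sketch of such a parser. It assumes every argument is a string wrapped in <escape> markers, as in the example above, and does not attempt to handle other value types. parse_function_call is a hypothetical helper, not part of transformers or FunctionGemma.

```python
import re

# One call in the format shown above:
# <start_function_call>call:NAME{ARGS}<end_function_call>
CALL_RE = re.compile(
    r"<start_function_call>call:(\w+)\{(.*?)\}<end_function_call>", re.DOTALL
)
# One string argument: key:<escape>value<escape>
ARG_RE = re.compile(r"(\w+):<escape>(.*?)<escape>", re.DOTALL)


def parse_function_call(text):
    """Return (name, args) for the first call found in text, or None."""
    match = CALL_RE.search(text)
    if match is None:
        return None
    name, raw_args = match.group(1), match.group(2)
    args = {key: value for key, value in ARG_RE.findall(raw_args)}
    return name, args


sample = (
    "<start_function_call>call:play_song"
    "{song_name:<escape>Bohemian Rhapsody<escape>}<end_function_call>"
)
print(parse_function_call(sample))
# → ('play_song', {'song_name': 'Bohemian Rhapsody'})
```

Returning None when no call is found lets the caller fall back to treating the output as ordinary text.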

Test Results

| Test Input | Expected Function | Result |
|---|---|---|
| "Play Bohemian Rhapsody" | play_song | ✅ Pass |
| "Pause the music" | playback_control | ✅ Pass |
| "Skip to next song" | playback_control | ✅ Pass |
| "Play Wonderwall" | play_song | ✅ Pass |
| "Resume" | playback_control | ✅ Pass |

Success Rate: 100% (5/5 tests)
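
A check like this one can be scripted so that it is easy to rerun after retraining. The sketch below is not the author's evaluation script. It assumes a callable generate(text) that returns the raw model output, and, like the table, it only checks which function was called, not the arguments. fake_generate is a hypothetical stand-in so the example runs without loading the model.

```python
import re

# (input, expected function) pairs copied from the test results above.
TEST_CASES = [
    ("Play Bohemian Rhapsody", "play_song"),
    ("Pause the music", "playback_control"),
    ("Skip to next song", "playback_control"),
    ("Play Wonderwall", "play_song"),
    ("Resume", "playback_control"),
]


def called_function(output):
    """Extract the function name from a FunctionGemma-style call, or None."""
    match = re.search(r"<start_function_call>call:(\w+)\{", output)
    return match.group(1) if match else None


def evaluate(generate, cases=TEST_CASES):
    """Return (passed, total) for a generate(text) -> raw output callable."""
    passed = sum(
        called_function(generate(text)) == expected for text, expected in cases
    )
    return passed, len(cases)


def fake_generate(text):
    """Hypothetical stand-in for the model: looks at the first word only."""
    if text.lower().startswith("play"):
        return ("<start_function_call>call:play_song"
                "{song_name:<escape>x<escape>}<end_function_call>")
    return ("<start_function_call>call:playback_control"
            "{action:<escape>pause<escape>}<end_function_call>")


passed, total = evaluate(fake_generate)
print(f"{passed}/{total} passed")
```

To evaluate the real model, replace fake_generate with a function that wraps the tokenizer and model.generate steps from the Usage section. With only 5 cases, a larger held-out set would give a more reliable accuracy estimate.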

Training Methodology

This model was trained as the first step of a gradual scaling approach, which starts with a small function set so that the 270M-parameter model is not overloaded:

  1. Function set: started with 2 functions (play_song, playback_control)
  2. Data: 30 examples per function, covering diverse phrasings
  3. Format: tool-call arguments are passed to apply_chat_template as a dict, not as a json.dumps() string

Key Learnings

  1. Critical bug fixed: tool-call arguments must be passed as a dict, not as json.dumps(arguments)
  2. Overload with many functions: training with 18 functions failed (0% accuracy), while 2 functions reached 100% (5/5 test cases)
  3. Gradual scaling: the recommended path is 2→4→8→18 functions
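
The first learning can be made concrete with a sketch of how one training example might be assembled. The source does not show the training data format, so the message layout here is an assumption: it follows the common Hugging Face tool-calling convention of an assistant message with a tool_calls list. build_training_example is a hypothetical helper that refuses JSON strings, so the bug cannot slip back in.

```python
import json


def build_training_example(user_text, name, arguments):
    """Build one chat-format example whose tool-call arguments stay a dict."""
    if not isinstance(arguments, dict):
        raise TypeError("arguments must be a dict, not a JSON string")
    return [
        {"role": "user", "content": user_text},
        {
            "role": "assistant",
            "tool_calls": [
                {"type": "function",
                 "function": {"name": name, "arguments": arguments}}
            ],
        },
    ]


example = build_training_example(
    "Play Imagine by John Lennon",
    "play_song",
    {"song_name": "Imagine", "artist": "John Lennon"},
)

# The buggy variant passes json.dumps(...), which is a str, and is rejected.
try:
    build_training_example(
        "Pause", "playback_control", json.dumps({"action": "pause"})
    )
except TypeError as error:
    print(error)
```

The resulting message list is what would then be handed to tokenizer.apply_chat_template together with the tool definitions.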

Limitations

  • Only supports 2 functions (play_song and playback_control)
  • Trained on English language only
  • Best performance with clear, direct commands
  • Not compatible with Ollama (Ollama doesn't support FunctionGemma's dynamic tool schema)

Future Work

  • Scale to 4 functions (add search_music, create_playlist)
  • Scale to 8 functions (add volume control, queue management)
  • Eventually scale to full 18-function music system

Citation

@misc{music-2func-2026,
  title={FunctionGemma Music Assistant (2-Function)},
  author={Your Name},
  year={2026},
  publisher={HuggingFace},
  howpublished={\url{https://huggingface.co/your-username/music-2func}}
}

License

Apache 2.0 (inherited from base model)

Base Model

This model is based on google/functiongemma-270m-it.
