FunctionGemma Music Assistant (2-Function)

A fine-tuned version of google/functiongemma-270m-it for music playback control. This model supports 2 core music functions and selected the correct function in all 5 of its test cases (100%).

Model Description

This is a LoRA-adapted FunctionGemma model trained specifically for music function calling. The model generates function calls in the FunctionGemma format for controlling music playback.

Training Details:

Base Model: google/functiongemma-270m-it (270M parameters)
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Parameters Trained: 3.8M (1.40% of total)
Training Examples: 60 (30 per function)
Training Time: ~1 minute
Accuracy: 100% (5/5 test cases)

Supported Functions

1. play_song

Play a specific song by name, optionally narrowed by artist or album.

Parameters:

song_name (required): Name of the song to play
artist (optional): Artist name
album (optional): Album name

Examples:

"Play Bohemian Rhapsody"
"Play Imagine by John Lennon"
"I want to hear Wonderwall"

2. playback_control

Control music playback (pause, resume, skip).

Parameters:

action (required): One of: play, pause, skip, next, previous, stop, resume

Examples:

"Pause"
"Resume"
"Skip to next song"
"Stop the music"

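The parameter lists above can be turned into a small check that runs before a call reaches the music player. This is a minimal sketch: the `SCHEMAS` layout and the `validate_call` helper are illustrative names, not part of the model or of any library, and only the parameter names and allowed `action` values are taken from this card.

```python
# Schema summary copied from the parameter lists above. The dict layout and
# the validate_call helper are hypothetical, for illustration only.
SCHEMAS = {
    "play_song": {
        "required": {"song_name"},
        "optional": {"artist", "album"},
        "enums": {},
    },
    "playback_control": {
        "required": {"action"},
        "optional": set(),
        "enums": {
            "action": {"play", "pause", "skip", "next", "previous", "stop", "resume"}
        },
    },
}


def validate_call(name, arguments):
    """Return a list of problems with a call; an empty list means it is valid."""
    schema = SCHEMAS.get(name)
    if schema is None:
        return [f"unknown function: {name}"]

    given = set(arguments)
    allowed = schema["required"] | schema["optional"]
    problems = []
    for key in sorted(schema["required"] - given):
        problems.append(f"missing required parameter: {key}")
    for key in sorted(given - allowed):
        problems.append(f"unexpected parameter: {key}")
    for key, choices in schema["enums"].items():
        if key in arguments and arguments[key] not in choices:
            problems.append(f"invalid value for {key}: {arguments[key]!r}")
    return problems
```

A caller might run `validate_call` on every parsed model output and fall back to asking the user to rephrase when the returned list is not empty.
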
Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    torch_dtype=torch.float16,
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained("google/functiongemma-270m-it")

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "your-username/music-2func")

# Define functions
FUNCTIONS = [
    {
        "type": "function",
        "function": {
            "name": "play_song",
            "description": "Play a specific song by name or artist",
            "parameters": {
                "type": "object",
                "properties": {
                    "song_name": {"type": "string", "description": "Name of the song to play"},
                    "artist": {"type": "string", "description": "Artist name (optional)"},
                    "album": {"type": "string", "description": "Album name (optional)"}
                },
                "required": ["song_name"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "playback_control",
            "description": "Control music playback",
            "parameters": {
                "type": "object",
                "properties": {
                    "action": {
                        "type": "string",
                        "enum": ["play", "pause", "skip", "next", "previous", "stop", "resume"]
                    }
                },
                "required": ["action"]
            }
        }
    }
]

# Generate function call
user_input = "Play Bohemian Rhapsody"

messages = [{"role": "user", "content": user_input}]

prompt = tokenizer.apply_chat_template(
    messages,
    tools=FUNCTIONS,
    add_generation_prompt=True,
    tokenize=False
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        temperature=0.1,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False)
print(response)

Expected Output:

<start_function_call>call:play_song{song_name:<escape>Bohemian Rhapsody<escape>}<end_function_call>
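
To act on the output, the function name and arguments have to be pulled out of that string. The following is a rough sketch of one way to do it with regular expressions. It assumes every argument is a string wrapped in `<escape>` markers, as in the expected output above, and that multiple arguments are separated inside the braces. The helper name `parse_function_call` is hypothetical, and non-string values are not handled.

```python
import re

# One call in the format shown above:
#   <start_function_call>call:NAME{ARGS}<end_function_call>
CALL_RE = re.compile(
    r"<start_function_call>call:(\w+)\{(.*?)\}<end_function_call>", re.DOTALL
)
# One string argument: key:<escape>value<escape>
ARG_RE = re.compile(r"(\w+):<escape>(.*?)<escape>", re.DOTALL)


def parse_function_call(text):
    """Return (name, arguments) for the first call in text, or None if there is none."""
    match = CALL_RE.search(text)
    if match is None:
        return None
    name, raw_args = match.groups()
    # Assumption: all values are <escape>-delimited strings.
    arguments = dict(ARG_RE.findall(raw_args))
    return name, arguments


sample = (
    "<start_function_call>call:play_song"
    "{song_name:<escape>Bohemian Rhapsody<escape>}<end_function_call>"
)
print(parse_function_call(sample))
```

Returning `None` when no call is present lets the caller distinguish a plain-text reply from a function call without raising an exception.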

Test Results

| Test Input | Expected Function | Result |
| --- | --- | --- |
| "Play Bohemian Rhapsody" | `play_song` | ✅ Pass |
| "Pause the music" | `playback_control` | ✅ Pass |
| "Skip to next song" | `playback_control` | ✅ Pass |
| "Play Wonderwall" | `play_song` | ✅ Pass |
| "Resume" | `playback_control` | ✅ Pass |

Success Rate: 100% (5/5 tests)
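
A check of this kind can be scripted in a few lines. The sketch below is not the author's evaluation script: it assumes a case passes when the predicted function name matches the expected one (the table lists only function names), and it uses a hypothetical keyword-based stub in place of the model so that it runs without downloading weights.

```python
# Test inputs and expected function names, copied from the table above.
TEST_CASES = [
    ("Play Bohemian Rhapsody", "play_song"),
    ("Pause the music", "playback_control"),
    ("Skip to next song", "playback_control"),
    ("Play Wonderwall", "play_song"),
    ("Resume", "playback_control"),
]


def success_rate(predict_function_name, cases):
    """Return (passed, total); a case passes if the predicted name matches."""
    passed = sum(
        1 for text, expected in cases if predict_function_name(text) == expected
    )
    return passed, len(cases)


def keyword_stub(text):
    """Hypothetical stand-in for the model; replace with a real inference call."""
    return "play_song" if text.lower().startswith("play ") else "playback_control"


passed, total = success_rate(keyword_stub, TEST_CASES)
print(f"Success rate: {passed / total:.0%} ({passed}/{total} tests)")
```

To evaluate the real model, `keyword_stub` would be swapped for a function that runs generation and extracts the function name from the decoded output.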

Training Methodology

This model was trained using a gradual scaling approach, starting small so that the 270M-parameter model is not overwhelmed by too many functions at once:

Started with 2 functions (play_song, playback_control)
30 examples per function covering diverse phrasings
Correct format: tool-call arguments are passed to apply_chat_template as a dict, NOT as a json.dumps() string

Key Learnings

Critical Bug Fixed: Tool-call arguments must be passed as a dict, not as json.dumps(arguments)
Cognitive Overload: Training on 18 functions at once failed (0% accuracy), while training on 2 functions reached 100% (5/5 test cases)
Gradual Scaling: Recommended path is 2→4→8→18 functions
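
The dict-versus-string point can be illustrated with a sketch of how one training example might be assembled. It assumes the `tool_calls` message layout used by Transformers chat templates. The helper name `build_training_example` is hypothetical, and the author's actual data pipeline is not shown in this card.

```python
import json


def build_training_example(user_text, function_name, arguments):
    """Build one chat-format example whose target is a single tool call."""
    # Guard against the bug described above: a JSON string instead of a dict.
    if not isinstance(arguments, dict):
        raise TypeError("arguments must be a dict, not a JSON string")
    return [
        {"role": "user", "content": user_text},
        {
            "role": "assistant",
            "tool_calls": [
                {
                    "type": "function",
                    "function": {"name": function_name, "arguments": arguments},
                }
            ],
        },
    ]


example = build_training_example(
    "Play Imagine by John Lennon",
    "play_song",
    {"song_name": "Imagine", "artist": "John Lennon"},
)

# The buggy variant passes a serialized string and is rejected here.
try:
    build_training_example("Pause", "playback_control", json.dumps({"action": "pause"}))
except TypeError as err:
    print(f"rejected: {err}")
```

The resulting `example` list is the kind of `messages` value that would be handed to `tokenizer.apply_chat_template` together with the tool definitions.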

Limitations

Only supports 2 functions (play_song and playback_control)
Trained on English language only
Best performance with clear, direct commands
Not compatible with Ollama (Ollama doesn't support FunctionGemma's dynamic tool schema)

Future Work

Scale to 4 functions (add search_music, create_playlist)
Scale to 8 functions (add volume control, queue management)
Eventually scale to full 18-function music system

Citation

@misc{music-2func-2026,
  title={FunctionGemma Music Assistant (2-Function)},
  author={Your Name},
  year={2026},
  publisher={HuggingFace},
  howpublished={\url{https://huggingface.co/your-username/music-2func}}
}

License

Apache 2.0 (inherited from base model)

Base Model

This model is based on google/functiongemma-270m-it.
