
MP3 File Splitter

MasterOfAI 2024. 12. 26. 17:38

Written : 2024.12.26
Updated : 2025.01.09

OS : Windows 10 

 

This Python code splits an mp3 file at quiet stretches (silence) that last at least a given length and writes each part out as a separate mp3 file.

from pydub import AudioSegment, silence

# Load the uploaded audio file
audio_path = "day1.mp3"
audio = AudioSegment.from_file(audio_path, format="mp3")

# Detect silence: stretches of at least 1 second (1000 ms) quieter than -40 dBFS
silent_chunks = silence.detect_silence(audio, min_silence_len=1000, silence_thresh=-40)

# Split audio based on silent chunks
output_files = []
for i, (start, end) in enumerate(zip([0] + [end for _, end in silent_chunks], [start for start, _ in silent_chunks] + [len(audio)])):
    segment = audio[start:end]
    output_path = f"segment_{i+1}.mp3"
    segment.export(output_path, format="mp3")
    output_files.append(output_path)

# Show the list of exported files (a bare expression prints nothing in a script)
print(output_files)
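The long zip(...) expression in the loop can be hard to read, so here is a minimal sketch of the same boundary calculation on plain numbers, with no audio file needed. The helper name segment_bounds and the sample timings are hypothetical and only serve the illustration.

```python
def segment_bounds(silent_chunks, total_len):
    """Return (start, end) pairs in ms for the audio between silent chunks."""
    # Each segment starts where the previous silence ended...
    starts = [0] + [end for _, end in silent_chunks]
    # ...and ends where the next silence begins
    ends = [start for start, _ in silent_chunks] + [total_len]
    return list(zip(starts, ends))

# Hypothetical 10-second clip with silence at 2.0-3.5 s and 6.0-7.2 s
bounds = segment_bounds([[2000, 3500], [6000, 7200]], 10_000)
print(bounds)  # [(0, 2000), (3500, 6000), (7200, 10000)]
```

If the recording begins with silence, the first pair comes out as (0, 0), an empty segment, so very small output files are to be expected in that case.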

 

For this code to run, the pydub package must be installed. Both AudioSegment and silence are part of pydub, so a single install is enough.

pip install pydub

 

FFmpeg must also be downloaded and set up on the system, because pydub relies on it to read and write mp3 files.

 

1. Go to https://ffmpeg.org/download.html

2. Select the Windows package

3. Select "Windows builds from gyan.dev"

 

4. Download "ffmpeg-git-full.7z"

5. Unzip "ffmpeg-git-full.7z" and rename the extracted directory to "ffmpeg"

6. Copy "ffmpeg" to C:\ 

7. Add "C:\ffmpeg\bin" to the PATH environment variable
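To confirm that step 7 took effect, a short check like the one below may help. It is only a sketch: the helper name find_ffmpeg is hypothetical, and it relies on the standard library's shutil.which, which looks up an executable on PATH.

```python
import shutil
import subprocess

def find_ffmpeg():
    """Return the full path of the ffmpeg executable found on PATH, or None."""
    return shutil.which("ffmpeg")

ffmpeg_path = find_ffmpeg()
if ffmpeg_path is None:
    print("ffmpeg was not found on PATH; check the environment variable setting")
else:
    # "ffmpeg -version" prints version information and exits
    result = subprocess.run([ffmpeg_path, "-version"], capture_output=True, text=True)
    print(result.stdout.splitlines()[0] if result.stdout else ffmpeg_path)
```

On Windows, a terminal that was already open before PATH was edited keeps the old value, so run the check from a newly opened terminal.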

 

 

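I added some extra features to the code.

 

The added features are as follows.

  • Among the mp3 files newly created by splitting, delete those whose size is 11,000 bytes or less.
  • Among the mp3 files newly created by splitting, remove unneeded ones such as bell sounds and intro music.
    • To do this, the waveform of each file is compared against reference samples, and a file is judged to be the same sound when the correlation reaches the similarity threshold (the code sets similarity_threshold to 0.95; adjust it based on experiments).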
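As a rough illustration of what that comparison measures, the sketch below computes a Pearson correlation on synthetic signals using only the standard library. The function name pearson and the sine-wave data are assumptions for the demo, not part of the original code, which uses np.corrcoef on audio loaded with librosa.

```python
import math

def pearson(a, b):
    """Pearson correlation of two sequences, trimmed to the shorter length."""
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Synthetic stand-ins for audio samples: 10 full periods of a sine wave
N = 1000
tone = [math.sin(2 * math.pi * 10 * i / N) for i in range(N)]
quieter = [0.3 * x for x in tone]  # same sound at lower volume
shifted = [math.cos(2 * math.pi * 10 * i / N) for i in range(N)]  # quarter-period offset

print(pearson(tone, quieter) > 0.999)       # True
print(abs(pearson(tone, shifted)) < 0.001)  # True
```

The measure ignores volume differences but is sensitive to timing: the same tone offset by a quarter period scores close to 0. A segment therefore only matches a reference sample when the two start at nearly the same point in the sound.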

 

For the code below to run, "librosa" must also be installed.

pip install librosa

 

import os
from pydub import AudioSegment, silence
import librosa
import numpy as np

# Function to calculate the similarity between two audio files
def calculate_similarity(sample_path, target_path):
    # Load the audio files
    sample_audio, sample_sr = librosa.load(sample_path, sr=None)
    target_audio, target_sr = librosa.load(target_path, sr=None)
    
    # Resample to the same sample rate if needed
    if sample_sr != target_sr:
        target_audio = librosa.resample(target_audio, orig_sr=target_sr, target_sr=sample_sr)
    
    # Trim both signals to the length of the shorter one
    min_length = min(len(sample_audio), len(target_audio))
    sample_audio = sample_audio[:min_length]
    target_audio = target_audio[:min_length]
    
    # Pearson correlation of the raw samples (zero lag) as the similarity measure
    similarity = np.corrcoef(sample_audio, target_audio)[0, 1]
    return similarity

# Load the uploaded audio file
audio_path = "basic_day1.mp3"
audio = AudioSegment.from_file(audio_path, format="mp3")

# Extract the base name of the audio file without extension
base_name = os.path.splitext(os.path.basename(audio_path))[0]

# Detect silence: stretches of at least 0.9 seconds (900 ms) quieter than -40 dBFS
silent_chunks = silence.detect_silence(audio, min_silence_len=900, silence_thresh=-40)

directory = "."
intro_file = "_intro.mp3"  # Path to the sample MP3 file
bell_file = "_bell.mp3"  # Path to the sample MP3 file
model_file = "_model.mp3"
small_first_file = "_small1.mp3"
small_second_file = "_small2.mp3"

similarity_threshold = 0.95  # Adjust threshold based on experiments

# Split audio based on silent chunks
output_files = []
for i, (start, end) in enumerate(zip([0] + [end for _, end in silent_chunks], [start for start, _ in silent_chunks] + [len(audio)])):
    segment = audio[start:end]
    output_path = f"{base_name}_{i+1}.mp3"
    segment.export(output_path, format="mp3")

    # Check the file size and delete the file if it is 11,000 bytes or smaller
    if os.path.getsize(output_path) <= 11000: 
        os.remove(output_path)
        print(f"Deleted file because it is too short: {output_path} ")
    else:
        output_files.append(output_path)


# Show the list of files that were kept after the size check
print(output_files)

for file in os.listdir(directory):
    # Skip the reference samples and the original source file so they are never deleted
    if file.endswith(".mp3") and file not in (intro_file, bell_file, model_file, small_first_file, small_second_file, os.path.basename(audio_path)):
        file_path = os.path.join(directory, file)
        intro = calculate_similarity(intro_file, file_path)
        bell = calculate_similarity(bell_file, file_path)
        model = calculate_similarity(model_file, file_path)
        small_first = calculate_similarity(small_first_file, file_path)
        small_second = calculate_similarity(small_second_file, file_path)
        
        if intro >= similarity_threshold:
            os.remove(file_path)
            print(f"Deleted similar file: {file} (intro music: {intro:.2f})")
        elif bell >= similarity_threshold:
            os.remove(file_path)
            print(f"Deleted similar file: {file} (bell sounds: {bell:.2f})")
        elif model >= similarity_threshold:
            os.remove(file_path)
            print(f"Deleted similar file: {file} (Model sounds: {model:.2f})")
        elif small_first >= similarity_threshold:
            os.remove(file_path)
            print(f"Deleted similar file: {file} (Small 1 sounds: {small_first:.2f})")
        elif small_second >= similarity_threshold:
            os.remove(file_path)
            print(f"Deleted similar file: {file} (Small 2 sounds: {small_second:.2f})")
        else:
            print(f"Kept: {file} (intro music: {intro:.2f}) (bell sounds: {bell:.2f}) (Model sounds: {model:.2f}) (Small 1 sounds: {small_first:.2f}) (Small 2 sounds: {small_second:.2f})")
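The if/elif chain above grows by one branch for every reference sample. One possible way to keep it compact is sketched below; the helper first_match and the example scores are hypothetical and not part of the original code.

```python
def first_match(scores, threshold):
    """Return the label of the first score at or above threshold, else None."""
    # Dicts keep insertion order, so labels are checked in the order given
    for label, score in scores.items():
        if score >= threshold:
            return label
    return None

# Hypothetical scores for one segment, in the same order as the elif chain
scores = {"intro music": 0.12, "bell sounds": 0.97, "Model sounds": 0.99}
print(first_match(scores, 0.95))                 # bell sounds
print(first_match({"intro music": 0.40}, 0.95))  # None
```

With a helper like this, the loop body would build one dict of scores per file, delete the file when a label is returned, and keep it otherwise. A NaN score, which np.corrcoef can produce for a constant signal, never satisfies the comparison and so counts as no match.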
