Dedetok: My Experience Notes: python


Monday, December 22, 2025

python3: my personal package mini function

~/
├── .gitignore
├── dedetoklib/
│   ├── mydate/
│   │   ├── __init__.py
│   │   ├── check_holiday.py
│   │   └── requirements.txt
│   └── ...
├── test/
│   └── main.py
└── requirements.txt

check_holiday.py

import holidays
from datetime import datetime

# Create holiday object for Indonesia
ID_HOLIDAYS = holidays.country_holidays("ID")


def is_holiday(date_str):
    """
    Check if a date (YYYY-MM-DD) is a holiday.

    Returns:
        (bool, str | None):
            - True and holiday name if holiday
            - False and None if not
    """
    date_obj = datetime.strptime(date_str, "%Y-%m-%d").date()

    if date_obj in ID_HOLIDAYS:
        return True, ID_HOLIDAYS[date_obj]
    else:
        return False, None


def is_working_day(date_str):
    """
    Check if a date (YYYY-MM-DD) is a working day (Monday to Friday).

    Returns:
        (bool, str):
            - True and "Weekday" if Monday to Friday
            - False and "Weekend" if Saturday or Sunday
    """
    date_obj = datetime.strptime(date_str, "%Y-%m-%d").date()
    # weekday(): Monday = 0, Sunday = 6
    if date_obj.weekday() < 5:
        return True, "Weekday"
    else:
        return False, "Weekend"

Test main.py

# Run from the project root so that dedetoklib is importable, e.g.:
#   PYTHONPATH=. python3 test/main.py
from dedetoklib.mydate import check_holiday

dates = ["2025-01-01", "2025-01-02", "2025-08-17"]

for d in dates:
    # Check holiday
    holiday_flag, holiday_name = check_holiday.is_holiday(d)

    # Check working day
    working_flag, working_str = check_holiday.is_working_day(d)

    print(f"{d}: {working_str}, Holiday? {holiday_flag}", end="")
    if holiday_flag:
        print(f" ({holiday_name})")
    else:
        print()

The module can also be imported under a shorter alias:

from dedetoklib.mydate import check_holiday as ch
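
The holiday check and the weekday check can be combined to answer one question: is this date a business day? Here is a minimal sketch of that idea. It is not part of dedetoklib. The name is_business_day and the hard-coded SAMPLE_HOLIDAYS dictionary are my assumptions, used only so the example runs without the holidays library; in real use the lookup would come from ID_HOLIDAYS.

```python
from datetime import date, datetime

# Hypothetical stand-in for the holidays lookup, so the sketch runs
# without third-party packages. Both entries are fixed-date holidays.
SAMPLE_HOLIDAYS = {
    date(2025, 1, 1): "New Year's Day",
    date(2025, 8, 17): "Independence Day",
}


def is_business_day(date_str, holiday_map=SAMPLE_HOLIDAYS):
    """Return True if date_str (YYYY-MM-DD) is Monday to Friday and not a holiday."""
    date_obj = datetime.strptime(date_str, "%Y-%m-%d").date()
    is_weekday = date_obj.weekday() < 5  # Monday = 0, Sunday = 6
    return is_weekday and date_obj not in holiday_map


if __name__ == "__main__":
    for d in ["2025-01-01", "2025-01-02", "2025-08-17"]:
        print(d, is_business_day(d))
```

Passing the holiday lookup as a parameter keeps the function easy to test with any date set.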

python3: writing package

Create a folder for your Python files, e.g. dedetoklib.

The directory structure:

~/
├── .gitignore
├── dedetoklib/
│   ├── __init__.py
│   └── hello.py
├── main.py
└── requirements.txt

hello.py

def say_hello():
    return "Hello, World"

__init__.py

(leave it empty)

main.py

import dedetoklib.hello

print(dedetoklib.hello.say_hello())

Import with an alias

import dedetoklib.hello as h

print(h.say_hello())

requirements.txt lists the dependencies of the project.

.gitignore lists the files and folders that Git should ignore.
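
To see why this layout works, here is a small sketch that builds the same dedetoklib folder in a temporary directory and imports it. The helper names build_demo_package and import_and_greet are made up for this example; only the folder layout and say_hello come from the post. It assumes nothing beyond the standard library.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create dedetoklib/__init__.py and dedetoklib/hello.py under root."""
    pkg = root / "dedetoklib"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")  # empty file, marks the folder as a package
    (pkg / "hello.py").write_text(
        'def say_hello():\n    return "Hello, World"\n'
    )


def import_and_greet() -> str:
    """Build the package in a temp folder, import it, and call say_hello()."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_demo_package(root)
        # Putting the root on sys.path is what running main.py from ~/ does
        sys.path.insert(0, str(root))
        try:
            importlib.invalidate_caches()
            hello = importlib.import_module("dedetoklib.hello")
            return hello.say_hello()
        finally:
            sys.path.remove(str(root))
            # Drop the demo modules so later imports are not affected
            sys.modules.pop("dedetoklib.hello", None)
            sys.modules.pop("dedetoklib", None)


if __name__ == "__main__":
    print(import_and_greet())
```

The point is that Python finds dedetoklib because the folder that contains it is on sys.path, which is the case when main.py sits next to the package.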

Friday, December 5, 2025

Python3: playwright to get audio stream URL

There are many tools for finding the audio stream URL on a radio station's page. Here are a few:

playwright
Selenium
Puppeteer

On a static, simple website, Wget can be used to find the audio URLs:

$ wget --spider -r -l 5 -nv -w 1 --no-clobber -A .mp3,.m3u8 "https://www.example.com/audio/stations" 2>&1 | grep -E '\.(mp3|m3u8)$' | awk '{print $3}' | sort -u > out_audio_urls.txt
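
The same filtering idea can be sketched in Python for a page whose HTML you already have. This is only an illustration with the standard library: it does not crawl or download anything, and the names AudioLinkParser, find_audio_urls and SAMPLE_HTML are my own, not from any tool above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

AUDIO_EXTENSIONS = (".mp3", ".m3u8")


class AudioLinkParser(HTMLParser):
    """Collect href/src attribute values that point to audio files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.found = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                absolute = urljoin(self.base_url, value)
                # Compare only the path, so a query string does not hide the extension
                if urlparse(absolute).path.lower().endswith(AUDIO_EXTENSIONS):
                    self.found.add(absolute)


def find_audio_urls(html, base_url):
    """Return the sorted, de-duplicated audio URLs found in html."""
    parser = AudioLinkParser(base_url)
    parser.feed(html)
    return sorted(parser.found)


# Made-up page content for the example
SAMPLE_HTML = """
<html><body>
  <a href="/audio/show1.mp3">Show 1</a>
  <audio src="https://cdn.example.com/live/stream.m3u8?token=abc"></audio>
  <a href="/about.html">About</a>
</body></html>
"""

if __name__ == "__main__":
    for u in find_audio_urls(SAMPLE_HTML, "https://www.example.com/audio/stations"):
        print(u)
```

This only works when the links are present in the HTML itself. When the player builds the stream URL with JavaScript, a real browser such as the Playwright setup below is needed.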

Install Playwright on Debian in a virtual environment:

myuser@mypc:~$ mkdir spider_playwright
myuser@mypc:~$ cd spider_playwright/
myuser@mypc:~/spider_playwright$ python3 -m venv venv
myuser@mypc:~/spider_playwright$ source venv/bin/activate
(venv) myuser@mypc:~/spider_playwright$ pip install playwright
Collecting playwright
  Downloading playwright-1.56.0-py3-none-manylinux1_x86_64.whl.metadata (3.5 kB)
Collecting pyee<14,>=13 (from playwright)
  Downloading pyee-13.0.0-py3-none-any.whl.metadata (2.9 kB)
Collecting greenlet<4.0.0,>=3.1.1 (from playwright)
  Using cached greenlet-3.2.4-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.metadata (4.1 kB)
Collecting typing-extensions (from pyee<14,>=13->playwright)
  Using cached typing_extensions-4.15.0-py3-none-any.whl.metadata (3.3 kB)
Downloading playwright-1.56.0-py3-none-manylinux1_x86_64.whl (46.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 46.3/46.3 MB 2.5 MB/s eta 0:00:00
Using cached greenlet-3.2.4-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (610 kB)
Downloading pyee-13.0.0-py3-none-any.whl (15 kB)
Using cached typing_extensions-4.15.0-py3-none-any.whl (44 kB)
Installing collected packages: typing-extensions, greenlet, pyee, playwright
Successfully installed greenlet-3.2.4 playwright-1.56.0 pyee-13.0.0 typing-extensions-4.15.0
(venv) myuser@mypc:~/spider_playwright$ playwright install
Downloading Chromium 141.0.7390.37 (playwright build v1194) from https://cdn.playwright.dev/dbazure/download/playwright/builds/chromium/1194/chromium-linux.zip
173.9 MiB [====================] 100% 0.0s
Chromium 141.0.7390.37 (playwright build v1194) downloaded to /home/myuser/.cache/ms-playwright/chromium-1194
Downloading Chromium Headless Shell 141.0.7390.37 (playwright build v1194) from https://cdn.playwright.dev/dbazure/download/playwright/builds/chromium/1194/chromium-headless-shell-linux.zip
104.3 MiB [====================] 100% 0.0s
Chromium Headless Shell 141.0.7390.37 (playwright build v1194) downloaded to /home/myuser/.cache/ms-playwright/chromium_headless_shell-1194
Downloading Firefox 142.0.1 (playwright build v1495) from https://cdn.playwright.dev/dbazure/download/playwright/builds/firefox/1495/firefox-debian-13.zip
96.7 MiB [====================] 100% 0.0s
Firefox 142.0.1 (playwright build v1495) downloaded to /home/myuser/.cache/ms-playwright/firefox-1495
Downloading Webkit 26.0 (playwright build v2215) from https://cdn.playwright.dev/dbazure/download/playwright/builds/webkit/2215/webkit-debian-13.zip
88.1 MiB [====================] 100% 0.0s
Webkit 26.0 (playwright build v2215) downloaded to /home/myuser/.cache/ms-playwright/webkit-2215
Downloading FFMPEG playwright build v1011 from https://cdn.playwright.dev/dbazure/download/playwright/builds/ffmpeg/1011/ffmpeg-linux.zip
2.3 MiB [====================] 100% 0.0s
FFMPEG playwright build v1011 downloaded to /home/myuser/.cache/ms-playwright/ffmpeg-1011

Create the file myspider.py and make it executable:

(venv) myuser@mypc:~/spider_playwright$ chmod u+x myspider.py

Source code of myspider.py; modify it to meet your requirements:

#!/usr/bin/env python3
import asyncio
from playwright.async_api import async_playwright

# generated and adjusted by chatgpt.com

AUDIO_HINTS = [
    ".mp3", ".aac", ".m3u8", ".ogg", ".opus", ".wav",
    "stream", "/proxy/", "/live", "radio"
]


async def scan_audio_stream(url: str, listen_seconds: int = 15):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        page = await browser.new_page()

        # ----- Detect manual browser close -----
        browser_disconnected = asyncio.Event()

        def on_browser_disconnect():
            print("\n[BROWSER CLOSED] Exiting application.")
            browser_disconnected.set()

        browser.on("disconnected", on_browser_disconnect)
        # ---------------------------------------

        found = set()

        async def on_response(res):
            u = res.url  # keep the original case; URLs can be case-sensitive
            ct = res.headers.get("content-type", "").lower()

            if "audio" in ct or any(k in u.lower() for k in AUDIO_HINTS):
                if u not in found:
                    print(f"[NEW AUDIO STREAM] {u}")
                    found.add(u)

        page.on("response", on_response)

        print("Opening page...")
        await page.goto(url, wait_until="domcontentloaded")
        await asyncio.sleep(2)

        # ---- Find audio player iframe ----
        candidate_frames = [
            f for f in page.frames
            if any(kw in f.url.lower() for kw in ["player", "radio", "embed"])
        ]
        if not candidate_frames:
            candidate_frames = page.frames  # fallback

        print("Attempting to click play...")
        for f in candidate_frames:
            try:
                for sel in ["button[aria-label='Play']", "button.play", ".jp-play", "button"]:
                    btn = await f.query_selector(sel)
                    if btn:
                        print(f"Clicked play in iframe → {f.url}")
                        await btn.click()
                        break
            except Exception:
                # Ignore frames where no button can be clicked
                pass

        print(f"\nListening for audio streams up to {listen_seconds} seconds...\n")

        # ----- Wait for listen_seconds OR browser close -----
        try:
            await asyncio.wait_for(browser_disconnected.wait(), timeout=listen_seconds)
        except asyncio.TimeoutError:
            print("\n[TIMEOUT] Done scanning.")
        # -----------------------------------------------------

        await browser.close()
        return list(found)


if __name__ == "__main__":
    results = asyncio.run(scan_audio_stream(
        "https://www.example.com",  # CHANGE HERE
        listen_seconds=15  # ← Fast scan
    ))

    print("\n===== ALL STREAMS FOUND =====")
    for s in results:
        print(s)

Alternative code: a synchronous version in which you type the URL and press Play manually

from playwright.sync_api import sync_playwright


def main():
    with sync_playwright() as p:
        browser = p.chromium.launch(
            headless=False  # <-- You can manually interact
        )
        context = browser.new_context()
        page = context.new_page()

        print("🚀 Browser launched. Type the radio station URL manually.")
        print("👉 When you press Play, audio stream URLs will appear here.\n")

        # Capture outgoing requests
        def on_request(request):
            url = request.url
            if (
                ".mp3" in url
                or ".aac" in url
                or ".m3u8" in url
                or ".ogg" in url
                or "stream" in url.lower()
                or request.resource_type == "media"
            ):
                print("🎧 AUDIO STREAM REQUEST FOUND:")
                print(url, "\n")

        page.on("request", on_request)

        # Capture responses containing audio
        def on_response(response):
            url = response.url
            headers = response.headers
            content_type = headers.get("content-type", "")
            if any(fmt in content_type for fmt in ["audio", "mpeg", "aac", "ogg"]):
                print("🎧 AUDIO STREAM RESPONSE FOUND:")
                print(url, "\n")

        page.on("response", on_response)

        # Open a blank page; you will type the URL manually
        page.goto("about:blank")

        # Keep the browser open until the page is closed
        # (timeout=0 disables the default timeout)
        page.wait_for_event("close", timeout=0)


if __name__ == "__main__":
    main()
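
Both scripts decide whether something is audio by matching substrings in the URL or the content type. That decision can be pulled into a small function and tried without opening a browser. The function name looks_like_audio and the shortened hint list are assumptions for this sketch; adjust the hints to the station you are scanning.

```python
# Shortened hint list for the sketch; broad hints like "radio" give more false positives
AUDIO_HINTS = (".mp3", ".aac", ".m3u8", ".ogg", ".opus", ".wav", "stream")


def looks_like_audio(url, content_type=""):
    """Return True if the content type or the URL suggests an audio stream."""
    ct = content_type.lower()
    # audio/* covers mp3, aac, ogg; HLS playlists use a "mpegurl" content type
    if ct.startswith("audio/") or "mpegurl" in ct:
        return True
    lowered = url.lower()
    return any(hint in lowered for hint in AUDIO_HINTS)


if __name__ == "__main__":
    samples = [
        ("https://example.com/live", "audio/mpeg"),
        ("https://example.com/hls/playlist.m3u8", ""),
        ("https://example.com/logo.png", "image/png"),
    ]
    for url, ct in samples:
        print(url, ct or "-", looks_like_audio(url, ct))
```

Inside the Playwright handlers this could be called as looks_like_audio(res.url, res.headers.get("content-type", "")).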

Enjoy

Monday, September 15, 2025

Python3: using Gemini and ChatGPT to help create python3 code to get the last 3 characters of a composite name in Sheet1

First we define the problem that we want the AI to solve with Python code. This is the crucial part for getting the output we expect. Open the AI chat and paste the definition below.

In Indonesian we define:

saya punya data excel di sheet1 yang berisi nama. headet sheet1 adalah no dan nama komposit. nama komposit terdiri dari 2 nama dipisah dengan tanda /. tidak semua nama komposit memiliki 2 nama. nama pertama ada yang menggunakan nama panggilan yang disimpan didalam tanda (). nama pertama selalu ada disetiap row. sayang ingin mengambil 3 huruf terakhir dari nama pertama, bukan yang berada dalam tanda kurung. buatkan kode pyhton3 dengan openpyxl library. print no, nama, dan 3 huruf terakhir, gunakan ; untuk pemisah dan masukan string kedalam "" agar output mudah di simpan kedalam csv.

In English we define:

I have Excel data in Sheet1 containing names. The headers of Sheet1 are no (number) and nama komposit (composite name). A composite name consists of two names separated by a / sign. Not all composite names have two names. Some first names include a nickname stored inside parentheses (). The first name is always present in every row. I want to take the last three letters of the first name, not of the part in parentheses. Write Python 3 code using the openpyxl library. Print the number, the name, and the last three letters; use ; as the separator and enclose each string in "" so that the output is easy to save as CSV.

Choose one of them and feed it into chat.openai.com / chatgpt.com and gemini.google.com; you can use two browser tabs or two windows. This is only an example; you need to define your own expected output.

I replaced the xlsx file name with "data_tr.xlsx".

Here is the code generated by chatgpt.com:

from openpyxl import load_workbook
import re

# Take the first name from "nama komposit" (the part before the / sign)
# Remove the part inside parentheses () from the first name
# Take the last 3 letters of the cleaned first name

# Replace with the path to your Excel file
file_path = "data_tr.xlsx"

# Load workbook and sheet
wb = load_workbook(filename=file_path)
ws = wb["Sheet1"]

# Print header
print('"no";"nama komposit";"3 huruf terakhir"')

# Process each row (starting from row 2 because row 1 is the header)
for row in ws.iter_rows(min_row=2, max_col=2):
    no = row[0].value
    nama_komposit = str(row[1].value).strip() if row[1].value else ""

    # Take the first name from the composite name
    nama_pertama = nama_komposit.split("/")[0].strip()

    # Remove the content in parentheses, e.g. (Budi)
    nama_pertama_bersih = re.sub(r"\([^)]*\)", "", nama_pertama).strip()

    # Take the last 3 letters (if long enough)
    tiga_huruf_terakhir = nama_pertama_bersih[-3:] if len(nama_pertama_bersih) >= 3 else nama_pertama_bersih

    # Format and print the result
    print(f'"{no}";"{nama_komposit}";"{tiga_huruf_terakhir}"')

Here is the code generated by gemini:

import openpyxl

# Replace 'nama_file_anda.xlsx' with the name of your Excel file
file_path = 'data_tr.xlsx'

try:
    # Open the workbook and select the active sheet
    workbook = openpyxl.load_workbook(file_path)
    sheet = workbook.active

    # Print the header
    header_string = '"No";"Nama Komposit";"3 Huruf Terakhir"'
    print(header_string)

    # Iterate from row 2 (skipping the header that was already printed)
    for row in sheet.iter_rows(min_row=2):
        no_cell = row[0]
        nama_komposit_cell = row[1]

        # Skip empty rows
        if not no_cell.value or not nama_komposit_cell.value:
            continue

        # Get the values from the cells
        no = no_cell.value
        nama_komposit = str(nama_komposit_cell.value)

        # Split the composite name on '/'
        nama_pertama = nama_komposit.split('/')[0].strip()

        # Remove the nickname in parentheses
        if '(' in nama_pertama and ')' in nama_pertama:
            idx_buka = nama_pertama.find('(')
            nama_pertama = nama_pertama[:idx_buka].strip()

        # Take the last 3 letters of the first name and convert them to uppercase
        tiga_huruf_terakhir = nama_pertama[-3:].upper()

        # Print the data in the requested format
        output_string = f'"{no}";"{nama_komposit}";"{tiga_huruf_terakhir}"'
        print(output_string)

except FileNotFoundError:
    print(f"Error: File '{file_path}' tidak ditemukan. Pastikan nama file sudah benar.")
except Exception as e:
    print(f"Terjadi kesalahan: {e}")

When the prompt is written in Indonesian, the generated code also uses Indonesian, for example in identifiers and output strings.

If you use a virtual environment, activate it first:

$ cd mypyenv
~/mypyenv$ source venv/bin/activate

Remember that AI can make mistakes: the output is generated from statistical values for each token, so you must check it. The output style also differs between tools, depending on the data each AI developer used during training.
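
One way to check generated code is to isolate its core logic in a small function and run it against cases written by hand. The sketch below does that for the name rule from the prompt. The function name last_three_letters and the sample names are made up for illustration; it needs no Excel file.

```python
import re


def last_three_letters(composite_name):
    """Return the last 3 letters of the first name, ignoring any (nickname)."""
    # The first name is the part before the / sign
    first_name = composite_name.split("/")[0]
    # Remove a nickname in parentheses such as "(Budi)"
    cleaned = re.sub(r"\([^)]*\)", "", first_name).strip()
    # Slicing returns the whole name when it is shorter than 3 letters
    return cleaned[-3:]


# Hand-written cases; the names are made up for illustration
CASES = {
    "Budiman (Budi) / Siti Aminah": "man",
    "Siti Aminah": "nah",
    "Al / Rina": "Al",
    "Joko (Jo)": "oko",
}

if __name__ == "__main__":
    for name, expected in CASES.items():
        result = last_three_letters(name)
        status = "OK" if result == expected else "MISMATCH"
        print(f'"{name}";"{result}";{status}')
```

A case list like this quickly shows differences between tools, for example that one generated version converts the result to uppercase although the prompt did not ask for it.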

This is how AI can help us create source code in an area we have not learned before.

Tuesday, September 9, 2025

Python3: fixing indentation in python

In Python, indentation must be consistent; the interpreter reports an error if it is not.
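
A quick way to see this error without running a whole program is the built-in compile() function. This is a minimal sketch; the helper name indentation_problem and the two sample sources are my own. The error is raised while the source is compiled, before any line of it runs.

```python
GOOD_SOURCE = "def greet():\n    message = 'hi'\n    return message\n"
# The last line is indented with 2 spaces instead of 4
BAD_SOURCE = "def greet():\n    message = 'hi'\n  return message\n"


def indentation_problem(source):
    """Return the error message if source has an indentation error, else None."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as exc:  # TabError is a subclass, so it is caught too
        return exc.msg
    return None


if __name__ == "__main__":
    print("good:", indentation_problem(GOOD_SOURCE))
    print("bad: ", indentation_problem(BAD_SOURCE))
```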

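Keeping indentation consistent by hand is hard in some editors. We can use black to reformat the code automatically, following PEP 8 (Python Enhancement Proposal 8). Note that black only reformats code it can parse, so a file that already fails with an indentation error has to be corrected by hand first.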

Installing black

# apt-get install black
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  python3-click python3-mypy-extensions python3-pathspec python3-platformdirs
Suggested packages:
  python-black-doc
The following NEW packages will be installed:
  black python3-click python3-mypy-extensions python3-pathspec python3-platformdirs
0 upgraded, 5 newly installed, 0 to remove and 1 not upgraded.
Need to get 1,502 kB of archives.
After this operation, 6,442 kB of additional disk space will be used.

Using black

$ black ./[your_python_file_to_fix].py
reformatted [your_python_file_to_fix].py

All done! ✨ 🍰 ✨
1 file reformatted.

Monday, September 1, 2025

python3: read xlsx using openpyxl

Install the openpyxl library on a Debian system (the Debian package is named python3-openpyxl)

# apt-get install python3-openpyxl
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
...

Install the openpyxl library in a user virtual environment (without touching the Python used by Debian)

$ cd mypyenv
~/mypyenv$ source venv/bin/activate
(venv) [user]@[hostname]:~/mypyenv$ pip list
Package Version
------- -------
pip     25.1.1
(venv) [user]@[hostname]:~/mypyenv$ pip install openpyxl
Collecting openpyxl
  Downloading openpyxl-3.1.5-py2.py3-none-any.whl.metadata (2.5 kB)
Collecting et-xmlfile (from openpyxl)
  Downloading et_xmlfile-2.0.0-py3-none-any.whl.metadata (2.7 kB)
Downloading openpyxl-3.1.5-py2.py3-none-any.whl (250 kB)
Downloading et_xmlfile-2.0.0-py3-none-any.whl (18 kB)
Installing collected packages: et-xmlfile, openpyxl
Successfully installed et-xmlfile-2.0.0 openpyxl-3.1.5
(venv) [user]@[hostname]:~/mypyenv$ pip list
Package    Version
---------- -------
et_xmlfile 2.0.0
openpyxl   3.1.5
pip        25.1.1

Here is a sample script to enumerate rows and columns

import openpyxl

path = "./[replace_with_your_file].xlsx"

# wb_obj = openpyxl.load_workbook(path)  # Open xlsx without options
wb_obj = openpyxl.load_workbook(path, data_only=True)  # Open xlsx with the data_only option

## sheet object
# using the active sheet
#sheet_obj = wb_obj.active
## using Sheet1
sheet_obj = wb_obj["Sheet1"]

# The sheet starts from 1,1 not 0,0
# Access the cell at row 1, column 1
#cell_obj = sheet_obj.cell(row=1, column=1)
#print("Cell 1 column 1 is ",cell_obj.value)
# to print row 1 column 20
#cell_obj = sheet_obj.cell(row=1, column=20)
#print("Cell 1 column 20 is ",cell_obj.value)

# Enumerate rows
# the first row is the header
# max_row may include empty rows
# range() excludes its end value, so use max_row + 1 to reach the last row
for i in range(2, sheet_obj.max_row + 1):
    cell_obj = sheet_obj.cell(row=i, column=1)
    if cell_obj.value is not None:
        # enumerate columns
        # we need a fixed number of columns, e.g. 6 columns from 1 to 6
        # Do not use max_column; use a fixed number instead
        for j in range(1, 7):
            mycell_obj = sheet_obj.cell(row=i, column=j)
            print(mycell_obj.value, " | ", end='')  # print without a new line
        print()  # print a new line
    else:
        # the row is empty, so we stop
        print()  # print a new line
        break

Monday, August 25, 2025

Debian 13: python3 setting up a virtual environment

Run once

Installing packages

As root

# apt install python3 python3-pip python3-venv

As a regular user, for example user1

We create a folder named mypyenv to store the python3 packages for user1

$ mkdir mypyenv
$ cd mypyenv
~/mypyenv$ python3 -m venv venv
~/mypyenv$ ls
venv

Every time you enter the virtual environment, use these commands

$ cd mypyenv
~/mypyenv$ source venv/bin/activate
(venv) [user]@[host]:~/mypyenv$

To exit the virtual environment

(venv) [user]@[host]:~/mypyenv$ deactivate
~/mypyenv$
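
Besides looking at the (venv) prompt, a script can check for itself whether it runs inside a virtual environment. This is a small sketch using only the standard library; the function name in_virtualenv is my own.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("prefix:     ", sys.prefix)
    print("base_prefix:", sys.base_prefix)
    print("in venv:    ", in_virtualenv())
```

Run it once before and once after source venv/bin/activate to compare the two prefixes.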

Bash script to enter the virtual environment

#!/bin/bash

# Change to the virtual environment directory
cd ~/[your_venv_directory] || { echo "Directory not found!"; exit 1; }

# Show the directory we changed to
echo "Changed to $(pwd). Activating virtual environment..."

# Activate the virtual environment
source venv/bin/activate || { echo "Activation failed!"; exit 1; }

echo "Virtual environment activated. You are now in (venv)."
exec bash

Recommendation: install pip-autoremove

(venv) [user]@[host]:~/mypyenv$ pip install pip-autoremove

To remove a package and all of its dependencies, e.g. cloudscraper

(venv) [user]@[host]:~/mypyenv$ pip-autoremove cloudscraper -y

Note:

Put every Python project under the virtual environment directory; for this example I used mypyenv.
A user can have multiple virtual environments, one for each project. Each project must have a single entry point in its root folder.

References: chatgpt.com, gemini.google.com