Showing posts with label python. Show all posts

Monday, December 22, 2025

python3: my personal package of mini functions

 

~/
├── .gitignore
├── dedetoklib/
│     └── mydate/
│           ├── __init__.py
│           ├── check_holiday.py
│           └── requirements.txt
├── test/
│     └── main.py
└── requirements.txt

 

check_holiday.py

import holidays
from datetime import datetime

# Create holiday object for Indonesia
ID_HOLIDAYS = holidays.country_holidays("ID")

def is_holiday(date_str):
    """
    Check if a date (YYYY-MM-DD) is a holiday.
    
    Returns:
        (bool, str | None): 
        - True and holiday name if holiday
        - False and None if not
    """
    date_obj = datetime.strptime(date_str, "%Y-%m-%d").date()
    
    if date_obj in ID_HOLIDAYS:
        return True, ID_HOLIDAYS[date_obj]
    else:
        return False, None

def is_working_day(date_str):
    """
    Check if a date (YYYY-MM-DD) is a working day (Monday to Friday).
    Public holidays are not considered here; use is_holiday() for that.

    Returns:
        (bool, str):
        - True and "Weekday" if Monday–Friday
        - False and "Weekend" if Saturday or Sunday
    """
    date_obj = datetime.strptime(date_str, "%Y-%m-%d").date()
    # weekday(): Monday = 0, Sunday = 6
    if date_obj.weekday() < 5:
        return True, "Weekday"
    else:
        return False, "Weekend"
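
Since is_working_day() only looks at the weekday, a date can be a "Weekday" and a holiday at the same time. The sketch below is one possible way to combine both checks into a single answer. It is only an illustration: the function name is_business_day and the small hand-made SAMPLE_HOLIDAYS table are assumptions made here so the example runs without the holidays library. In the real module the table would be ID_HOLIDAYS.

```python
from datetime import date, datetime

# Hypothetical hand-made table, used only so this sketch runs on its own.
# check_holiday.py uses holidays.country_holidays("ID") instead.
SAMPLE_HOLIDAYS = {
    date(2025, 1, 1): "New Year's Day",
    date(2025, 8, 17): "Independence Day",
}


def is_business_day(date_str, holiday_table=SAMPLE_HOLIDAYS):
    """Return True only if the date is Monday to Friday and not a holiday."""
    day = datetime.strptime(date_str, "%Y-%m-%d").date()
    # weekday(): Monday = 0, Sunday = 6
    return day.weekday() < 5 and day not in holiday_table


for d in ["2025-01-01", "2025-01-02", "2025-01-04"]:
    print(d, is_business_day(d))
```

The holiday table is passed as a parameter so the same function can be tried with a small test table or with the real holiday object.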

Test main.py

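# check_holiday.py lives in dedetoklib/mydate/
# run from ~/ with: PYTHONPATH=. python3 test/main.py
from dedetoklib.mydate import check_holiday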

dates = ["2025-01-01", "2025-01-02", "2025-08-17"]

for d in dates:
    # Check holiday
    holiday_flag, holiday_name = check_holiday.is_holiday(d)
    
    # Check working day
    working_flag, working_str = check_holiday.is_working_day(d)
    
    print(f"{d}: {working_str}, Holiday? {holiday_flag}", end="")
    if holiday_flag:
        print(f" ({holiday_name})")
    else:
        print()

To import the module under a shorter alias:

from dedetoklib.mydate import check_holiday as ch



 


 

python3: writing package

 
Create a folder to hold your Python files, e.g. dedetoklib.

The directory structure:

~/
├── .gitignore
├── dedetoklib/
│   ├── __init__.py
│   └── hello.py
├── main.py
└── requirements.txt

hello.py

def say_hello():
    return "Hello, World"

__init__.py

(leave it empty)

main.py 

import dedetoklib.hello

print(dedetoklib.hello.say_hello())

Import with an alias

import dedetoklib.hello as h

print(h.say_hello())

requirements.txt lists the dependencies.

.gitignore lists the files and folders that git should ignore.
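
To try the layout without touching your home directory, the sketch below builds the same two-file package in a temporary folder and imports it. The package name demo_pkg is a placeholder chosen here to avoid clashing with a real dedetoklib. Everything else mirrors the hello.py and empty __init__.py shown above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # demo_pkg is a hypothetical name standing in for dedetoklib
    pkg_dir = Path(root) / "demo_pkg"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")  # left empty, as in the post
    (pkg_dir / "hello.py").write_text(
        'def say_hello():\n    return "Hello, World"\n'
    )

    # Make the folder that contains the package importable
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        hello = importlib.import_module("demo_pkg.hello")
        greeting = hello.say_hello()
    finally:
        sys.path.remove(root)

print(greeting)
```

The point of the sketch is that Python finds a package through the folder that contains it, which is why main.py sits next to dedetoklib/ rather than inside it.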

Friday, December 5, 2025

Python3: playwright to get audio url stream

There are many tools for finding the audio stream URL on a radio station's page. Here are a few:

  1. playwright
  2. Selenium
  3. Puppeteer

For static, simple websites, wget can find the audio URLs:

$ wget --spider -r -l 5 -nv -w 1 --no-clobber -A .mp3,.m3u8 "https://www.example.com/audio/stations" 2>&1 | grep -E '\.(mp3|m3u8)$' | awk '{print $3}' | sort -u > out_audio_urls.txt

Install playwright on Debian in a virtual environment:

myuser@mypc:~$ mkdir spider_playwright
myuser@mypc:~$ cd spider_playwright/
myuser@mypc:~/spider_playwright$ python3 -m venv venv
myuser@mypc:~/spider_playwright$ source venv/bin/activate
(venv) myuser@mypc:~/spider_playwright$ pip install playwright
Collecting playwright
  Downloading playwright-1.56.0-py3-none-manylinux1_x86_64.whl.metadata (3.5 kB)
Collecting pyee<14,>=13 (from playwright)
  Downloading pyee-13.0.0-py3-none-any.whl.metadata (2.9 kB)
Collecting greenlet<4.0.0,>=3.1.1 (from playwright)
  Using cached greenlet-3.2.4-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.metadata (4.1 kB)
Collecting typing-extensions (from pyee<14,>=13->playwright)
  Using cached typing_extensions-4.15.0-py3-none-any.whl.metadata (3.3 kB)
Downloading playwright-1.56.0-py3-none-manylinux1_x86_64.whl (46.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 46.3/46.3 MB 2.5 MB/s eta 0:00:00
Using cached greenlet-3.2.4-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (610 kB)
Downloading pyee-13.0.0-py3-none-any.whl (15 kB)
Using cached typing_extensions-4.15.0-py3-none-any.whl (44 kB)
Installing collected packages: typing-extensions, greenlet, pyee, playwright
Successfully installed greenlet-3.2.4 playwright-1.56.0 pyee-13.0.0 typing-extensions-4.15.0
(venv) myuser@mypc:~/spider_playwright$ playwright install
Downloading Chromium 141.0.7390.37 (playwright build v1194) from https://cdn.playwright.dev/dbazure/download/playwright/builds/chromium/1194/chromium-linux.zip
173.9 MiB [====================] 100% 0.0s
Chromium 141.0.7390.37 (playwright build v1194) downloaded to /home/myuser/.cache/ms-playwright/chromium-1194
Downloading Chromium Headless Shell 141.0.7390.37 (playwright build v1194) from https://cdn.playwright.dev/dbazure/download/playwright/builds/chromium/1194/chromium-headless-shell-linux.zip
104.3 MiB [====================] 100% 0.0s
Chromium Headless Shell 141.0.7390.37 (playwright build v1194) downloaded to /home/myuser/.cache/ms-playwright/chromium_headless_shell-1194
Downloading Firefox 142.0.1 (playwright build v1495) from https://cdn.playwright.dev/dbazure/download/playwright/builds/firefox/1495/firefox-debian-13.zip
96.7 MiB [====================] 100% 0.0s
Firefox 142.0.1 (playwright build v1495) downloaded to /home/myuser/.cache/ms-playwright/firefox-1495
Downloading Webkit 26.0 (playwright build v2215) from https://cdn.playwright.dev/dbazure/download/playwright/builds/webkit/2215/webkit-debian-13.zip
88.1 MiB [====================] 100% 0.0s
Webkit 26.0 (playwright build v2215) downloaded to /home/myuser/.cache/ms-playwright/webkit-2215
Downloading FFMPEG playwright build v1011 from https://cdn.playwright.dev/dbazure/download/playwright/builds/ffmpeg/1011/ffmpeg-linux.zip
2.3 MiB [====================] 100% 0.0s
FFMPEG playwright build v1011 downloaded to /home/myuser/.cache/ms-playwright/ffmpeg-1011

Create the file myspider.py and make it executable:

(venv) myuser@mypc:~/spider_playwright$ chmod u+x myspider.py 

Source code of myspider.py. Modify it to meet your requirements:

import asyncio
from playwright.async_api import async_playwright

# generated and adjusted with chatgpt.com

AUDIO_HINTS = [
    ".mp3", ".aac", ".m3u8", ".ogg", ".opus", ".wav",
    "stream", "/proxy/", "/live", "radio"
]

async def scan_audio_stream(url: str, listen_seconds: int = 15):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        page = await browser.new_page()

        # ----- Detect manual browser close -----
        browser_disconnected = asyncio.Event()

        def on_browser_disconnect():
            print("\n[BROWSER CLOSED] Exiting application.")
            browser_disconnected.set()

        browser.on("disconnected", on_browser_disconnect)
        # ---------------------------------------

        found = set()

        async def on_response(res):
            u = res.url.lower()
            ct = res.headers.get("content-type", "").lower()

            if "audio" in ct or any(k in u for k in AUDIO_HINTS):
                if u not in found:
                    print(f"[NEW AUDIO STREAM] {u}")
                    found.add(u)

        page.on("response", on_response)

        print("Opening page...")
        await page.goto(url, wait_until="domcontentloaded")
        await asyncio.sleep(2)

        # ---- Find audio player iframe ----
        candidate_frames = [
            f for f in page.frames
            if any(kw in f.url.lower() for kw in ["player", "radio", "embed"])
        ]
        if not candidate_frames:
            candidate_frames = page.frames  # fallback

        print("Attempting to click play...")
        for f in candidate_frames:
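            try:
                for sel in ["button[aria-label='Play']", "button.play", ".jp-play", "button"]:
                    btn = await f.query_selector(sel)
                    if btn:
                        await btn.click()
                        print(f"Clicked play in iframe → {f.url}")
                        break
            except Exception as exc:
                # Ignore frames where the click fails, but report why
                print(f"Could not click in {f.url}: {exc}")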

        print(f"\nListening for audio streams up to {listen_seconds} seconds...\n")

        # ----- Wait for listen_seconds OR browser close -----
        try:
            await asyncio.wait_for(browser_disconnected.wait(), timeout=listen_seconds)
        except asyncio.TimeoutError:
            print("\n[TIMEOUT] Done scanning.")
        # -------------------------------------------------

        await browser.close()
        return list(found)


if __name__ == "__main__":
    results = asyncio.run(scan_audio_stream(
        "https://www.example.com", # CHANGE HERE
        listen_seconds=15     # ← Fast scan
    ))

    print("\n===== ALL STREAMS FOUND =====")
    for s in results:
        print(s)
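
The matching rule inside on_response can be tried without launching a browser. The sketch below copies the AUDIO_HINTS list and the condition into a plain function. The function name looks_like_audio and the sample URLs are assumptions made for illustration. It also shows a limitation worth knowing: broad hints such as "radio" match any URL that contains the word, so expect some false positives on a radio station's site.

```python
AUDIO_HINTS = [
    ".mp3", ".aac", ".m3u8", ".ogg", ".opus", ".wav",
    "stream", "/proxy/", "/live", "radio",
]


def looks_like_audio(url: str, content_type: str = "") -> bool:
    """Same rule as on_response: audio content type, or any hint in the URL."""
    u = url.lower()
    ct = content_type.lower()
    return "audio" in ct or any(hint in u for hint in AUDIO_HINTS)


# Hypothetical sample URLs, not taken from a real station
samples = [
    ("https://www.example.com/live/station.mp3", "audio/mpeg"),
    ("https://www.example.com/radio/logo.png", "image/png"),  # false positive
    ("https://www.example.com/css/site.css", "text/css"),
]

for url, content_type in samples:
    print(looks_like_audio(url, content_type), url)
```

If the list of results is too noisy, removing the broad hints and relying on the content type and file extensions is one option to consider.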

A second version using the synchronous API, where you type the station URL into the browser yourself:

from playwright.sync_api import sync_playwright

def main():
    with sync_playwright() as p:
        browser = p.chromium.launch(
            headless=False   # <-- You can manually interact
        )
        context = browser.new_context()
        page = context.new_page()

        print("🚀 Browser launched. Type the radio station URL manually.")
        print("👉 When you press Play, audio stream URLs will appear here.\n")

        # Capture outgoing requests
        def on_request(request):
            url = request.url

            if (
                ".mp3" in url or
                ".aac" in url or
                ".m3u8" in url or
                ".ogg" in url or
                "stream" in url.lower() or
                request.resource_type == "media"
            ):
                print("🎧 AUDIO STREAM REQUEST FOUND:")
                print(url, "\n")

        page.on("request", on_request)

        # Capture responses containing audio
        def on_response(response):
            url = response.url
            headers = response.headers
            content_type = headers.get("content-type", "")

            if any(fmt in content_type for fmt in ["audio", "mpeg", "aac", "ogg"]):
                print("🎧 AUDIO STREAM RESPONSE FOUND:")
                print(url, "\n")

        page.on("response", on_response)

        # Open a blank page — you will type the URL manually
        page.goto("about:blank")

        # Keep the script running until the page is closed
        # (timeout=0 disables the default 30-second timeout)
        page.wait_for_event("close", timeout=0)


if __name__ == "__main__":
    main()

Enjoy

Monday, September 15, 2025

Python3: using Gemini and ChatGPT to help create Python 3 code that gets the last 3 characters of a composite name in Sheet1

First we define the problem so that we can ask the AI to generate Python code. This is the crucial part for getting the output we expect. Open the AI chat and paste in the definition below.

In Indonesian:

saya punya data excel di sheet1 yang berisi nama.
header sheet1 adalah no dan nama komposit.
nama komposit terdiri dari 2 nama dipisah dengan tanda /.
tidak semua nama komposit memiliki 2 nama.
nama pertama ada yang menggunakan nama panggilan yang disimpan didalam tanda ().
nama pertama selalu ada disetiap row.
saya ingin mengambil 3 huruf terakhir dari nama pertama, bukan yang berada dalam tanda kurung.
buatkan kode python3 dengan openpyxl library.
print no, nama, dan 3 huruf terakhir, gunakan ; untuk pemisah dan masukan string kedalam "" agar output mudah di simpan kedalam csv.

In English:

I have Excel data in Sheet1 containing names.
The headers of Sheet1 are no and nama komposit (number and composite name).
A composite name consists of two names separated by a / character.
Not all composite names have two names.
Some first names include a nickname stored inside parentheses ().
The first name is always present in every row.
I want to extract the last three letters of the first name, not of the nickname in parentheses.
Write Python 3 code using the openpyxl library.
Print the number, name, and last three letters; use ; as the separator and enclose each string in "" so the output is easy to save as CSV.

Choose one of them and feed it into chatgpt.com (chat.openai.com) and gemini.google.com; you can use two browser tabs or windows. This is only an example; you need to define your own expected output.

I replaced the xlsx file name with "data_tr.xlsx".

Here is the code generated by chatgpt.com:

from openpyxl import load_workbook
import re

# Mengambil nama pertama dari "nama komposit" (sebelum tanda /)
# Menghapus bagian dalam tanda kurung () dari nama pertama
# Mengambil 3 huruf terakhir dari nama pertama yang telah dibersihkan

# Ganti dengan path ke file Excel kamu
file_path = "data_tr.xlsx"

# Load workbook dan sheet
wb = load_workbook(filename=file_path)
ws = wb["Sheet1"]

# Print header
print('"no";"nama komposit";"3 huruf terakhir"')

# Proses tiap baris (mulai dari baris ke-2 karena baris 1 adalah header)
for row in ws.iter_rows(min_row=2, max_col=2):
    no = row[0].value
    nama_komposit = str(row[1].value).strip() if row[1].value else ""

    # Ambil nama pertama dari nama komposit
    nama_pertama = nama_komposit.split("/")[0].strip()

    # Hapus isi dalam tanda kurung, misalnya (Budi)
    nama_pertama_bersih = re.sub(r"\([^)]*\)", "", nama_pertama).strip()

    # Ambil 3 huruf terakhir (jika cukup panjang)
    tiga_huruf_terakhir = nama_pertama_bersih[-3:] if len(nama_pertama_bersih) >= 3 else nama_pertama_bersih

    # Format dan cetak hasil
    print(f'"{no}";"{nama_komposit}";"{tiga_huruf_terakhir}"')

Here is the code generated by gemini:

import openpyxl

# Ganti 'nama_file_anda.xlsx' dengan nama file Excel Anda
file_path = 'data_tr.xlsx'

try:
    # Membuka workbook dan memilih sheet aktif
    workbook = openpyxl.load_workbook(file_path)
    sheet = workbook.active

    # Mencetak header
    header_string = '"No";"Nama Komposit";"3 Huruf Terakhir"'
    print(header_string)

    # Iterasi dari baris ke-2 (mengabaikan header yang sudah dicetak)
    for row in sheet.iter_rows(min_row=2):
        no_cell = row[0]
        nama_komposit_cell = row[1]
        
        # Mengabaikan baris kosong
        if not no_cell.value or not nama_komposit_cell.value:
            continue

        # Mengambil nilai dari cell
        no = no_cell.value
        nama_komposit = str(nama_komposit_cell.value)
        
        # Memisahkan nama komposit berdasarkan '/'
        nama_pertama = nama_komposit.split('/')[0].strip()
        
        # Menghapus nama panggilan dalam kurung
        if '(' in nama_pertama and ')' in nama_pertama:
            idx_buka = nama_pertama.find('(')
            nama_pertama = nama_pertama[:idx_buka].strip()
            
        # Mengambil 3 huruf terakhir dari nama pertama dan mengubahnya menjadi huruf kapital
        tiga_huruf_terakhir = nama_pertama[-3:].upper()
        
        # Mencetak data sesuai format yang diminta
        output_string = f'"{no}";"{nama_komposit}";"{tiga_huruf_terakhir}"'
        print(output_string)

except FileNotFoundError:
    print(f"Error: File '{file_path}' tidak ditemukan. Pastikan nama file sudah benar.")
except Exception as e:
    print(f"Terjadi kesalahan: {e}")

The generated code contains Indonesian comments and messages because the prompt was written in Indonesian.

If you use a virtual environment, activate it first:

$ cd mypyenv
~/mypyenv$ source venv/bin/activate

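Remember that AI may make mistakes: the output is generated from statistical values computed for each token, so you must check it. You may also see that the output style differs between tools, depending on the data each AI developer fed in during training. For example, the Gemini version above converts the three letters to upper case, which the prompt did not ask for.

This is how AI can help us create source code for something we have never learned before.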
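
One way to check the generated logic before running it on the real sheet is to pull the name handling into a small function and try it on a few hand-made names. The sketch below reuses the regular expression from the ChatGPT version. The function name last_three_letters and the sample names are assumptions made for illustration, not data from the real file.

```python
import re


def last_three_letters(composite_name: str) -> str:
    """Last 3 letters of the first name, ignoring a nickname in parentheses."""
    # Take the part before "/" (the first name)
    first_name = composite_name.split("/")[0]
    # Remove anything inside parentheses, e.g. "(Budi)"
    first_name = re.sub(r"\([^)]*\)", "", first_name).strip()
    # Slicing returns the whole string when it is shorter than 3 characters
    return first_name[-3:]


# Hypothetical sample names with the results we expect by hand
checks = {
    "Budiman (Budi) / Santoso": "man",
    "Sulastri": "tri",
    "Al": "Al",
}

for name, expected in checks.items():
    assert last_three_letters(name) == expected, name

print("all checks passed")
```

If a check fails, the prompt or the generated code needs another look before the output goes into a CSV file.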

Tuesday, September 9, 2025

Python3: fixing indentation in python

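In Python, indentation must be consistent; the interpreter reports an IndentationError when it is not.

Keeping it consistent is hard when the code has been written in several different editors. We can use black to reformat the code automatically in a style that follows PEP 8 (Python Enhancement Proposal 8). If an indentation error stops the file from being parsed, black reports the error instead of reformatting, and that line has to be fixed by hand first.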
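
To see what the interpreter reports, the sketch below compiles two small source strings with the built-in compile() function, one with consistent indentation and one without. The helper name check_indentation and the sample strings are assumptions made for illustration.

```python
def check_indentation(source: str) -> str:
    """Compile the source and report an indentation problem, if any."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as exc:  # TabError is a subclass of this
        return f"{type(exc).__name__}: {exc.msg}"
    return "ok"


good = "def f():\n    x = 1\n    return x\n"
# The return line is indented deeper than the line before it
bad = "def f():\n    x = 1\n      return x\n"

print(check_indentation(good))
print(check_indentation(bad))
```

The same check can be run on a whole file from the shell with python3 -m py_compile followed by the file name.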

Installing black

# apt-get install black
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  python3-click python3-mypy-extensions python3-pathspec python3-platformdirs
Suggested packages:
  python-black-doc
The following NEW packages will be installed:
  black python3-click python3-mypy-extensions python3-pathspec
  python3-platformdirs
0 upgraded, 5 newly installed, 0 to remove and 1 not upgraded.
Need to get 1,502 kB of archives.
After this operation, 6,442 kB of additional disk space will be used.

Using black

$ black ./[your_python_file_to_fix].py
reformatted [your_python_file_to_fix].py

All done! ✨ 🍰 ✨
1 file reformatted.

Monday, September 1, 2025

python3: read xlsx using openpyxl

Install the openpyxl library on a Debian system

# apt-get install python3-openpyxl
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
...

Install the openpyxl library in a user virtual environment (so it does not interfere with the Python used by Debian)

$ cd mypyenv
~/mypyenv$ source venv/bin/activate
(venv) [user]@[hostname]:~/mypyenv$ pip list
Package Version
------- -------
pip     25.1.1
(venv) [user]@[hostname]:~/mypyenv$ pip install openpyxl
Collecting openpyxl
  Downloading openpyxl-3.1.5-py2.py3-none-any.whl.metadata (2.5 kB)
Collecting et-xmlfile (from openpyxl)
  Downloading et_xmlfile-2.0.0-py3-none-any.whl.metadata (2.7 kB)
Downloading openpyxl-3.1.5-py2.py3-none-any.whl (250 kB)
Downloading et_xmlfile-2.0.0-py3-none-any.whl (18 kB)
Installing collected packages: et-xmlfile, openpyxl
Successfully installed et-xmlfile-2.0.0 openpyxl-3.1.5

(venv) [user]@[hostname]:~/mypyenv$ pip list
Package    Version
---------- -------
et_xmlfile 2.0.0
openpyxl   3.1.5
pip        25.1.1

Here is a sample script that enumerates rows and columns:

import openpyxl
from datetime import datetime

path = "./[replace_with_your_file].xlsx"

# wb_obj = openpyxl.load_workbook(path) # Open xlsx without option
wb_obj = openpyxl.load_workbook(path, data_only=True) # Open xlsx with option Data Only

## object sheet
# using active sheet
#sheet_obj = wb_obj.active
## using Sheet1
sheet_obj = wb_obj["Sheet1"]

# Sheet start from 1,1 not 0,0
# Access cell at row 1, column 1
#cell_obj = sheet_obj.cell(row=1, column=1)
#print("Cell 1 column 1 is ",cell_obj.value)

# to print row 1 column 20
#cell_obj = sheet_obj.cell(row=1, column=20)
#print("Cell 1 column 20 is ",cell_obj.value)

# Enumerate rows
# the first row is the header
# max_row may include empty rows
for i in range(2, sheet_obj.max_row + 1):  # + 1 because range() excludes the end
    cell_obj = sheet_obj.cell(row=i, column=1)
    if cell_obj.value is not None:
        # enumerate columns
        # use a fixed column count, e.g. 6 columns, from 1 to 6
        # do not use max_column: it may include empty columns, and the
        # "is not None" test above only works for detecting empty rows
        for j in range(1, 7):
            mycell_obj = sheet_obj.cell(row=i, column=j)
            print(mycell_obj.value, " | ", end='')  # print without new line
        print()  # print a new line
    else:
        # the row is empty, so we stop
        print()  # print a new line
        break


Monday, August 25, 2025

Debian 13: python3 setting virtual environment

Run once

Installing packages

As root

# apt install python3 python3-pip python3-venv

As a regular user, for example user1

Create a folder to hold the Python 3 packages for user1. Here the folder is named mypyenv.

$ mkdir mypyenv
$ cd mypyenv
~/mypyenv$ python3 -m venv venv
~/mypyenv$ ls
venv

Use these commands every time you enter the virtual environment

$ cd mypyenv
~/mypyenv$ source venv/bin/activate
(venv) [user]@[host]:~/mypyenv$

To exit virtual environment

(venv) [user]@[host]:~/mypyenv$ deactivate
~/mypyenv$ 
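
If you are not sure whether the virtual environment is active, Python itself can tell. The sketch below compares sys.prefix with sys.base_prefix, which differ only inside a virtual environment. The helper name in_virtualenv is an assumption made for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when this interpreter runs inside a virtual environment."""
    # Inside a venv, sys.prefix points at the venv folder while
    # sys.base_prefix still points at the system Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

Run it once before and once after source venv/bin/activate to compare the two results.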

Bash script to enter the virtual environment

#!/bin/bash

# Change to the directory that contains the virtual environment
cd ~/[your_venv_directory] || { echo "Directory not found!"; exit 1; }

# Check if we successfully changed the directory
echo "Changed to $(pwd). Activating virtual environment..."

# Activate the virtual environment
source venv/bin/activate || { echo "Activation failed!"; exit 1; }

echo "Virtual environment activated. You are now in (venv)."
exec bash

Recommendation: install pip-autoremove

(venv) [user]@[host]:~/mypyenv$ pip install pip-autoremove 

To remove a package and all of its unused dependencies, e.g. cloudscraper

(venv) [user]@[host]:~/mypyenv$ pip-autoremove cloudscraper -y

Note

  1. Put every Python project under the virtual environment directory; in this example I used mypyenv.
  2. A user can have multiple virtual environments, one per project. Each project must have a single entry point in its root folder.

References: chatgpt.com gemini.google.com