Using glob for Pattern-Based File Matching
Often, you need to find files that match a certain pattern, such as “all text files” or “all images starting with photo_”.
The glob module provides a convenient way to search for files using wildcard patterns (like those used in Unix/Linux shells).
Key Wildcard Patterns
- * (asterisk) → Matches any number of characters (including zero).
  Example: "*.txt" → matches all files ending in .txt.
- ? (question mark) → Matches exactly one character.
  Example: "data?.csv" → matches data1.csv or dataA.csv, but not data10.csv.
- [ ] (brackets) → Defines a character set or range.
  Example: "[0-9].txt" → matches 1.txt or 7.txt.
  Example: "file[AB].csv" → matches fileA.csv or fileB.csv.
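A quick way to try these wildcards without touching the filesystem is the standard-library fnmatch module, which glob uses for filename matching. The sketch below checks each pattern from the list against a handful of made-up filenames; the names list and the matching helper are illustrative assumptions, not part of glob.

```python
from fnmatch import fnmatchcase

# Hypothetical filenames, used only for illustration.
names = [
    "notes.txt", "data1.csv", "dataA.csv", "data10.csv",
    "7.txt", "fileA.csv", "fileC.csv",
]

def matching(pattern):
    """Return the entries of `names` that match a shell-style pattern."""
    return [n for n in names if fnmatchcase(n, pattern)]

print(matching("*.txt"))         # ['notes.txt', '7.txt']
print(matching("data?.csv"))     # ['data1.csv', 'dataA.csv']
print(matching("[0-9].txt"))     # ['7.txt']
print(matching("file[AB].csv"))  # ['fileA.csv']
```

fnmatchcase is used here so the comparison is case-sensitive on every platform. Unlike glob, fnmatch does not treat a leading dot specially.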
Using glob.glob()
The main function is:
glob.glob(pattern, recursive=False)
- Returns a list of path names matching the pattern, in arbitrary order.
- By default, it does not search nested subdirectories; to do so, use ** in the pattern and pass recursive=True.
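The two points above can be checked in a throwaway directory. This is a minimal sketch: the file names (script.py, pkg/helper.py) are made up, and glob.escape() is applied to the temporary path only so that any special characters in it are not read as wildcards.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: one .py file at the top, one inside pkg/.
    os.mkdir(os.path.join(tmp, "pkg"))
    for rel in ["script.py", os.path.join("pkg", "helper.py")]:
        open(os.path.join(tmp, rel), "w").close()

    base = glob.escape(tmp)
    py_files = glob.glob(os.path.join(base, "*.py"))
    md_files = glob.glob(os.path.join(base, "*.md"))

    py_names = [os.path.basename(p) for p in py_files]

print(type(py_files).__name__)  # list
print(py_names)                 # ['script.py'] (pkg/helper.py is not found)
print(md_files)                 # [] (no match gives an empty list)
```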
Examples:
import glob
# Example 1: Find all Python files in current directory
py_files = glob.glob("*.py")
print(py_files)
# e.g., ['script.py', 'test_module.py']
# Example 2: Find all CSV files beginning with 'data'
csv_files = glob.glob("data*.csv")
print(csv_files)
# e.g., ['data1.csv', 'data_results.csv']
# Example 3: Find single-character filenames with .txt
txt_files = glob.glob("?.txt")
print(txt_files)
# e.g., ['a.txt']
- "*.py" → matches all .py files.
- "?.txt" → matches .txt files whose name is exactly one character, such as a.txt or 1.txt.
Recursive Glob Searches
To search in subdirectories:
# Find all .txt files in current directory + subdirectories
all_txt = glob.glob("**/*.txt", recursive=True)
print(all_txt)
- **/*.txt → Matches .txt files at any depth.
- recursive=True is required for ** to work.
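To see what recursive=True changes, the following sketch runs the same ** pattern with and without it over a small temporary tree. The directory layout and file names are hypothetical; without recursive=True, ** behaves like a plain *, so only files exactly one directory down are found.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical tree: top.txt, sub/mid.txt, sub/deep/low.txt
    os.makedirs(os.path.join(tmp, "sub", "deep"))
    for rel in ["top.txt",
                os.path.join("sub", "mid.txt"),
                os.path.join("sub", "deep", "low.txt")]:
        open(os.path.join(tmp, rel), "w").close()

    pattern = os.path.join(glob.escape(tmp), "**", "*.txt")

    def relative(paths):
        """Sort paths and express them relative to tmp with / separators."""
        return sorted(os.path.relpath(p, tmp).replace(os.sep, "/") for p in paths)

    with_recursion = relative(glob.glob(pattern, recursive=True))
    without_recursion = relative(glob.glob(pattern))

print(with_recursion)     # ['sub/deep/low.txt', 'sub/mid.txt', 'top.txt']
print(without_recursion)  # ['sub/mid.txt']
```

On large directory trees a recursive ** search can take a long time, since every subdirectory is visited.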
Note on Hidden Files
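- By default, glob does not match hidden files (names starting with .).
- To match them, the pattern must start with a dot explicitly.
  Example: ".*.txt" matches hidden files ending in .txt.
- On Python 3.11 and later, glob.glob() also accepts include_hidden=True so that wildcards match hidden files.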
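The hidden-file rule can be demonstrated with one visible and one hidden file in a temporary directory. This is a sketch with made-up file names (visible.txt and .hidden.txt).

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical files: one regular, one hidden (leading dot).
    for name in ["visible.txt", ".hidden.txt"]:
        open(os.path.join(tmp, name), "w").close()

    base = glob.escape(tmp)
    plain = sorted(os.path.basename(p)
                   for p in glob.glob(os.path.join(base, "*.txt")))
    dotted = sorted(os.path.basename(p)
                    for p in glob.glob(os.path.join(base, ".*.txt")))

print(plain)   # ['visible.txt']
print(dotted)  # ['.hidden.txt']
```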
Using Path.glob() from pathlib
If you use pathlib, Path objects provide two methods:
- .glob(pattern) → matches relative to the path; non-recursive unless the pattern contains **.
- .rglob(pattern) → recursive.
Example:
from pathlib import Path
for file_path in Path('.').rglob("*.py"):
    print(file_path)
- Works like glob.glob("**/*.py", recursive=True), but yields Path objects one at a time instead of returning a list of strings.
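As a rough comparison of the two pathlib methods with the glob module, the sketch below builds a small temporary tree and collects the matches from each. The layout (main.py and pkg/util.py) is hypothetical, and results are sorted because neither API guarantees an order.

```python
import glob
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical layout: main.py at the top, util.py inside pkg/.
    (root / "pkg").mkdir()
    (root / "main.py").touch()
    (root / "pkg" / "util.py").touch()

    # Path.glob() and Path.rglob() yield Path objects lazily.
    top_level = sorted(p.relative_to(root).as_posix() for p in root.glob("*.py"))
    all_levels = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))

    # The glob module returns plain strings for the equivalent search.
    recursive_pattern = os.path.join(glob.escape(tmp), "**", "*.py")
    via_glob = sorted(
        Path(p).relative_to(root).as_posix()
        for p in glob.glob(recursive_pattern, recursive=True)
    )

print(top_level)   # ['main.py']
print(all_levels)  # ['main.py', 'pkg/util.py']
print(via_glob)    # ['main.py', 'pkg/util.py']
```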
Summary: the glob module and pathlib's Path.glob() and Path.rglob() make finding files by pattern simple and efficient, removing the need for manual directory walking.
