A Non-Techie’s Step-by-Step Guide to Uploading 20,000 Files with Limited Internet
Introduction
Managing and uploading a large number of files can be a daunting task for anyone, let alone non-technical users with limited internet connectivity. However, these challenges can be overcome with the right approach and tools.
In this comprehensive guide, I’ll be walking through my journey as a non-techie tackling the mission of uploading over 19,000 files to SharePoint with only a temperamental mobile internet connection at my disposal.
By leveraging Excel formulas, Python scripting, and chatbot AI, I could systematically identify missing files, segregate them, and successfully get every last one uploaded despite the connectivity challenges.
If you find yourself facing similarly massive file management tasks with limited resources, follow along for a step-by-step blueprint that can set you up for success.
The Challenge: Uploading 19,824 Files with Unreliable Mobile Internet
My goal was to upload 19,824 files to SharePoint, a popular cloud-based file-sharing and storage platform for enterprises. This was part of a Kimble data migration project.
Unfortunately, my only internet connection option for this mission was a shaky 4G mobile hotspot prone to frequent disconnects and speeds slower than molasses in winter.
As you can imagine, directly uploading almost 20,000 files over this connection was an exercise in frustration:
- Speeds averaged less than 1 Mbps, so uploads dragged on for hours
- The connection would frequently drop mid-upload, losing progress
- Many files would get stuck in limbo between my device and SharePoint
After multiple attempts over several days, I had only successfully uploaded 19,460 files, according to SharePoint. But comparing this to my original folder, I discovered a discrepancy – 364 files appeared missing from SharePoint after the uploads.
Somehow amidst the connectivity challenges, these files had fallen through the cracks. But finding 364 specific files among 19,824 seemed overwhelmingly daunting.
Time to strategize!
Step 1: Identifying The Missing Files Among Thousands
Pinpointing the missing needles in this vast digital haystack required two key inputs:
A. The SharePoint File Export
Luckily, SharePoint provides an “Export to Excel” feature allowing me to download a CSV file listing all successfully uploaded files and their metadata. This would serve as my baseline.
B. A Full Manifest of My Original File Folder
I needed a complete manifest of all pre-upload files to audit my original folder for comparison. Here’s how I got the full list of files located in my local folder:
-> I navigated to my local folder containing the full 19,824 original files
-> I selected all files using Ctrl+A
-> I held Shift, right-clicked the selection, and chose “Copy as path” to copy the full list of file paths
-> I pasted the list into a new Excel sheet to create a manifest
With these two lists, I could cross-check the manifest against the SharePoint export in Excel to isolate the missing files.
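For anyone who would rather script the manifest than copy it from File Explorer, here is a minimal Python sketch. It is not what I used at the time; the function names build_manifest and save_manifest and the output file manifest.txt are my own placeholders, and it assumes all the files sit directly in one folder with no subfolders.

```python
from pathlib import Path


def build_manifest(folder):
    """Return the sorted names of all files directly inside `folder`."""
    return sorted(p.name for p in Path(folder).iterdir() if p.is_file())


def save_manifest(names, out_file):
    """Write one file name per line to a plain-text file."""
    Path(out_file).write_text("\n".join(names) + "\n", encoding="utf-8")


# Example call (hypothetical paths, adjust to your own folder):
# names = build_manifest(r"C:\Users\Your_Name\Your_Folder\Your_Folder_With_All_Files")
# save_manifest(names, "manifest.txt")
```

The resulting text file can be opened in Excel or pasted into a sheet in the same way as the copied path list.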
Comparing The Lists with Excel’s VLOOKUP
I used Excel’s VLOOKUP formula to flag files present in my original manifest but missing from the SharePoint export.
The VLOOKUP syntax was:
=VLOOKUP([Lookup_value], [Table_array], [Col_index_num], [Range_lookup])
Where:
- [Lookup_value] = What you want to look up (a file name from my manifest)
- [Table_array] = Where you want to look for it (the list of all file names from the SharePoint export)
- [Col_index_num] = 1, since I only needed to know whether a match existed. This is the number of the column in the range that contains the value to return.
- [Range_lookup] = 0 (FALSE), for an exact match only
Dragging this formula down my manifest sheet compared each original file name against the SharePoint export. Names with no match returned #N/A, which flagged the 364 missing entries!
I saved the list of missing files in a separate file named list.xlsx.
This approach allowed me to surgically identify the missing files in a sea of 19,000+ records. Now I could move on to remedying the situation.
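The same comparison can also be sketched in a few lines of Python, which may help if Excel slows down on large sheets. This is only an illustration of the idea, not the method I used: the helper names are hypothetical, the sample file names are made up, and the match ignores upper and lower case, as VLOOKUP does. The first helper assumes the manifest lines are full paths wrapped in double quotes, which is what “Copy as path” produces.

```python
import ntpath


def name_from_copied_path(line):
    """Turn a line pasted from "Copy as path" into a bare file name."""
    return ntpath.basename(line.strip().strip('"'))


def find_missing(manifest_names, uploaded_names):
    """Return manifest names with no match among the uploaded names, in manifest order."""
    uploaded = {name.strip().lower() for name in uploaded_names}
    return [name for name in manifest_names if name.strip().lower() not in uploaded]


# Made-up sample data standing in for the two real lists
manifest_paths = [
    r'"C:\Files\report_01.pdf"',
    r'"C:\Files\report_02.pdf"',
    r'"C:\Files\invoice_07.xlsx"',
]
manifest = [name_from_copied_path(p) for p in manifest_paths]
uploaded = ["Report_01.pdf", "invoice_07.xlsx"]

print(find_missing(manifest, uploaded))  # ['report_02.pdf']
```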
Step 2: Automated File Extraction with Python
I now faced a laborious manual task – scouring thousands of files to extract 364 specific ones identified as missing.
Rather than waste hours on this, I decided to explore whether automation could help. Enter ChatGPT-4, an AI chatbot assistant.
I provided context on my dilemma and asked if ChatGPT could suggest an automated approach to extract these missing files selectively. Impressively, it provided full Python code to do just that!
import os
import shutil
import pandas as pd


def sort_and_copy_files(excel_file, src_dir, dest_dir):
    # Load the Excel file (pandas treats the first row as the column header)
    df = pd.read_excel(excel_file)

    # Check each file listed in the Excel file
    for index, row in df.iterrows():
        file_name = row.iloc[0]  # assuming the file names are in the first column of the Excel file
        src_file = os.path.join(src_dir, file_name)

        # If the file exists in the source directory, copy it to the destination directory
        if os.path.exists(src_file):
            shutil.copy2(src_file, dest_dir)
        else:
            print(f"File {file_name} not found in source directory.")


if __name__ == "__main__":
    excel_file = r'C:\Users\Your_Name\Your_Folder\list.xlsx'  # Excel file containing the list of files
    src_dir = r'C:\Users\Your_Name\Your_Folder\Your_Folder_With_All_Files'  # Source directory
    dest_dir = r'C:\Users\Your_Name\Your_Folder\Your_Folder_Where_To_Copy_The_Missed_Files'  # Destination directory

    # Create the destination directory if it does not exist
    os.makedirs(dest_dir, exist_ok=True)

    sort_and_copy_files(excel_file, src_dir, dest_dir)
Next, I created a new file on my computer with the .py extension (for example, Sort_Copy.py), pasted the script into it, and saved it. Now I just needed to get set up.
Installing Python
ChatGPT supplied simple instructions to install Python 3.10:
First, you need to install Python on your computer if it’s not already installed. You can download it from the official Python website: https://www.python.org/
Installing Required Libraries
ChatGPT informed me the code required two additional libraries:
- pandas – for reading the Excel file into a table the script can loop over
- openpyxl – the library pandas relies on to open .xlsx files
Once Python was installed, I needed to install the necessary Python libraries. I opened a terminal window (Command Prompt in Windows: Start -> Command Prompt) and typed the following command:
pip install pandas openpyxl
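If you are unsure whether the installation worked, a short check like the one below can confirm it before you run the main script. It is only a sketch; the helper name is_installed is my own, and it merely tests whether Python can find each library, not whether the version is suitable.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Report the status of the two libraries the copy script depends on
for name in ("pandas", "openpyxl"):
    status = "installed" if is_installed(name) else "missing: run pip install " + name
    print(f"{name}: {status}")
```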
Executing the Script
With Python ready to go, I opened a terminal window (Command Prompt) and navigated to the directory where I saved the Sort_Copy.py file using the cd command:
cd C:\Users\Your_Name\Your_Folder
After updating the source and destination directories, I ran the script with the following command:
python Sort_Copy.py
The script churned away and selectively copied the 364 missing files!
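In the spirit of double-checking, a small follow-up sketch can confirm that every file on the list really landed in the destination folder. The function name check_copied is hypothetical, and the comparison ignores upper and lower case on the assumption that the files live on a Windows drive.

```python
import os


def check_copied(expected_names, dest_dir):
    """Return the expected names that are not present in dest_dir."""
    present = {name.lower() for name in os.listdir(dest_dir)}
    return [name for name in expected_names if name.lower() not in present]


# Example call (hypothetical names and path, adjust to your own):
# still_missing = check_copied(["report_02.pdf"], r"C:\Users\Your_Name\Your_Folder\Your_Folder_Where_To_Copy_The_Missed_Files")
# print(len(still_missing), "files still missing")
```

An empty result means every listed file was copied; anything left over is worth checking by hand for typos or stray spaces in the name.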
Step 3: Re-Uploading Remaining Files in Batches
With my missing files neatly extracted into their own folder, it was time to get them, along with any other files yet to be uploaded, safely to their SharePoint destination.
Of course, my fickle mobile hotspot connection was still an obstacle. But I could maximize my chances by uploading in targeted batches:
- Split missing files into batches of no more than 50 each
- On any failed uploads, isolate problem files and retry in smaller batches
While this approach required meticulous record-keeping, it allowed me to chip away at the upload in bite-sized pieces.
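I split the batches by hand, but the same idea can be sketched in Python. The example below is an assumption-laden illustration rather than my actual process: make_batches and stage_batches are hypothetical names, and the batch_01, batch_02, ... folder naming is just one possible convention. Each batch folder can then be dragged into SharePoint on its own.

```python
import os
import shutil


def make_batches(file_names, batch_size=50):
    """Split a list of file names into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [file_names[i:i + batch_size] for i in range(0, len(file_names), batch_size)]


def stage_batches(src_dir, out_dir, batch_size=50):
    """Copy the files in src_dir into numbered batch folders under out_dir."""
    names = sorted(n for n in os.listdir(src_dir) if os.path.isfile(os.path.join(src_dir, n)))
    batches = make_batches(names, batch_size)
    for number, batch in enumerate(batches, start=1):
        batch_dir = os.path.join(out_dir, f"batch_{number:02d}")
        os.makedirs(batch_dir, exist_ok=True)
        for name in batch:
            shutil.copy2(os.path.join(src_dir, name), batch_dir)
    return len(batches)


# 364 missing files in batches of 50 gives 7 full batches plus one of 14
batches = make_batches([f"file_{i}.pdf" for i in range(364)])
print(len(batches), len(batches[-1]))  # 8 14
```

Smaller batches mean less lost progress when the connection drops, at the cost of more manual uploads.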
Within 30 minutes (including all actions described above), success – my SharePoint migration was 100% complete!
Key Takeaways: Overcoming Tech Challenges as a Non-Techie
For non-technical users facing daunting IT tasks, the key lessons I learned are:
- Divide and conquer – Break big problems down into smaller steps and batches
- Automate when possible – Don’t reinvent the wheel. Leverage tools like Excel, Python, and AI to work smarter.
- Verify everything – Cross-check data at every stage to catch anomalies early.
- Be patient and persistent – Don’t expect smooth sailing with temperamental technology. Navigate issues calmly and methodically.
- Ask for help – You don’t need to have all the answers as a non-techie. Lean on AI assistants when needed.
While modern workplace technology can seem intimidating, it can empower us just as much if leveraged effectively. With the right strategic approach, even non-technical users can take control of daunting IT challenges.