Class FileOperation

  • Direct Known Subclasses:
    FileDownloader, FileUploader

    public class FileOperation
    extends Object
    This is an internal class and is not meant to be used by end users of the filesystem API. The consequences of using this class directly in client code are not guaranteed and may be undesirable.

    This is the base class from which the classes FileUploader and FileDownloader are derived. The purpose of this class is to model certain basic functions like searching directories for specific file patterns, retrieving file attributes and determining transfer strategies (single vs multi-part).

    • Field Detail

      • opMode

        protected final OpMode opMode
      • db

        protected final GPUdb db
      • recursive

        protected final boolean recursive
        Indicates whether the search for files is to be recursive through a local directory hierarchy.
      • fileNames

        protected List<String> fileNames
        The list of source file names/patterns to be processed.
      • namesOfFilesUploaded

        protected Set<String> namesOfFilesUploaded
        The set of names of the files uploaded, used by the FileIngestor to insert records from the uploaded files into the database.
      • dirName

        protected String dirName
        Target directory name (KiFS for upload, local for download).
      • multiPartList

        protected List<String> multiPartList
        List of file names to be transferred as multi-part uploads/downloads.
      • multiPartRemoteFileNames

        protected List<String> multiPartRemoteFileNames
      • fullFileList

        protected List<String> fullFileList
        List of file names that can be downloaded in full.
    • Constructor Detail

      • FileOperation

        public FileOperation​(GPUdb db,
                             OpMode opMode,
                             List<String> fileNames,
                             String dirName,
                             boolean recursive,
                             GPUdbFileHandler.Options fileHandlerOptions)
                      throws GPUdbException
        Constructs a new file operation instance, managing the transfer of a set of files to a target directory.
        Parameters:
        db - The GPUdb instance used to access KiFS.
        opMode - Indicates whether this is an upload or download operation.
        fileNames - List of source file names.
        dirName - Name of the local/remote target directory depending upon whether it is a download/upload operation.
        recursive - Indicates whether any directories given in fileNames should be searched for files recursively.
        fileHandlerOptions - Options for setting up the files for transfer.
        Throws:
        GPUdbException - propagates exceptions raised from various argument validations.
    • Method Detail

      • decideMultiPart

        protected void decideMultiPart()
                                throws GPUdbException
        Resolves file names and categorizes them into single-part or multi-part transfers.
        Throws:
        GPUdbException
      • sortFilesIntoFullAndMultipartLists

        protected void sortFilesIntoFullAndMultipartLists​(List<String> fileList,
                                                          List<String> remoteFileList)
        Buckets files to be uploaded into full or multi-part groups, based on local file sizes.
        Parameters:
        fileList - A list of source file names to triage by size.
        remoteFileList - A list of target file names for the given sources.
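        The size-based triage described above can be sketched roughly as follows. This is an illustration only: the class name SizeTriageSketch, the threshold constant, the extra sizeOf and thresholdBytes parameters, and the way the lists are filled are all assumptions, since the actual cutoff and bookkeeping used by FileOperation are not documented here. The field names mirror the documented fields purely for readability.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToLongFunction;

class SizeTriageSketch {
    // Hypothetical cutoff; the real threshold is defined by the library, not by this sketch.
    static final long ASSUMED_THRESHOLD_BYTES = 60L * 1024 * 1024;

    final List<String> fullFileList = new ArrayList<>();
    final List<String> multiPartList = new ArrayList<>();
    final List<String> multiPartRemoteFileNames = new ArrayList<>();

    // Walks the two parallel lists and buckets each source file by its size.
    // sizeOf is injected so the logic can be exercised without touching the disk.
    void sort(List<String> fileList, List<String> remoteFileList,
              ToLongFunction<String> sizeOf, long thresholdBytes) {
        for (int i = 0; i < fileList.size(); i++) {
            String local = fileList.get(i);
            if (sizeOf.applyAsLong(local) > thresholdBytes) {
                // Too large for one request: transfer in parts.
                multiPartList.add(local);
                multiPartRemoteFileNames.add(remoteFileList.get(i));
            } else {
                // Small enough to transfer in a single request.
                fullFileList.add(local);
            }
        }
    }

    public static void main(String[] args) {
        SizeTriageSketch sketch = new SizeTriageSketch();
        // File.length() returns 0 for files that do not exist.
        sketch.sort(Arrays.asList("a.csv", "b.bin"),
                    Arrays.asList("/dir/a.csv", "/dir/b.bin"),
                    name -> new File(name).length(),
                    ASSUMED_THRESHOLD_BYTES);
        System.out.println("full: " + sketch.fullFileList);
        System.out.println("multi-part: " + sketch.multiPartList);
    }
}
```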
      • searchLocalDirectories

        protected org.apache.commons.lang3.tuple.Pair<List<String>,​List<String>> searchLocalDirectories​(String baseDir,
                                                                                                              String pattern)
                                                                                                       throws IOException
        Searches the local filesystem starting from a specified base directory using standard Java NIO glob patterns. This method resolves local file paths and calculates their corresponding target remote paths, preserving the relative directory structure found during the search.

        This method leverages FileSystem.getPathMatcher(String) and supports the full standard glob syntax.

        Supported Glob Patterns

        • *.java - Matches any file ending with the specific extension.
        • * - Matches any number of characters (e.g., *.csv matches all CSV files).
        • ** - Matches any number of directories. Used implicitly if the recursive flag is set, but can be used explicitly (e.g., **/test/*.xml).
        • ? - Matches exactly one character (e.g., data_?.txt matches data_1.txt but not data_10.txt).
        • {sun,moon,stars} - Matches any of the comma-separated subpatterns (e.g., *.{jpg,png} matches both JPG and PNG files).
        • [A-Z] - Matches any single uppercase letter in the range (e.g., grade_[A-F].txt).
        • [0-9] - Matches any digit (e.g., file[0-9].log).
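        The glob patterns listed above can be tried out directly against the standard Java NIO matcher. The sketch below uses only FileSystem.getPathMatcher(String); the class name GlobSketch and the helper matches are hypothetical names introduced for this example and are not part of the API.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class GlobSketch {
    // Returns true if the given path matches the glob under the default file system.
    static boolean matches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        System.out.println(matches("*.csv", "report.csv"));            // true
        System.out.println(matches("data_?.txt", "data_1.txt"));       // true
        System.out.println(matches("data_?.txt", "data_10.txt"));      // false: ? is exactly one character
        System.out.println(matches("*.{jpg,png}", "photo.png"));       // true
        System.out.println(matches("grade_[A-F].txt", "grade_B.txt")); // true
        System.out.println(matches("**/test/*.xml", "src/test/a.xml")); // true
        System.out.println(matches("*.csv", "dir/report.csv"));        // false: * does not cross directories
    }
}
```

        Note that a single * never crosses a directory boundary, which is why ** is needed for recursive matches.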

        Remote Path Calculation

        For every file found, the method calculates a "Remote Path" to ensure the directory structure is mirrored on the destination (KiFS).
         Logic: TargetRemoteDir + (FoundFilePath - BaseSearchDir)
         Example:
         Base Dir:          /data/logs
         Found File:        /data/logs/2023/jan/access.log
         Target Remote Dir: /kifs/backup
         Result:            /kifs/backup/2023/jan/access.log
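
        The calculation above can be reproduced with java.nio.file.Path.relativize. This is a minimal sketch, not the library's implementation: the class name RemotePathSketch and the method remotePath are hypothetical, and "/" is assumed to be the separator on the remote side.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class RemotePathSketch {
    // TargetRemoteDir + (FoundFilePath - BaseSearchDir), joined with "/" (assumed remote separator).
    static String remotePath(String baseDir, String foundFile, String targetRemoteDir) {
        // Strip the base directory from the found file to get the relative part.
        Path relative = Paths.get(baseDir).relativize(Paths.get(foundFile));
        StringBuilder remote = new StringBuilder(targetRemoteDir);
        // Append each path element so the result does not depend on the local separator.
        for (Path element : relative) {
            remote.append('/').append(element.toString());
        }
        return remote.toString();
    }

    public static void main(String[] args) {
        // Prints /kifs/backup/2023/jan/access.log
        System.out.println(remotePath("/data/logs",
                                      "/data/logs/2023/jan/access.log",
                                      "/kifs/backup"));
    }
}
```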
         
        Parameters:
        baseDir - The absolute or relative path to the local directory where the search begins.
        pattern - The glob pattern to match against file names (e.g., "*.csv", "data_2023_*.{json,xml}").
        Returns:
        A Pair where:
        • Left: A list of absolute local file paths found.
        • Right: A list of corresponding full remote target paths.
        Throws:
        IOException - If an I/O error occurs during the file walk (e.g., permission denied).
        See Also:
        FileSystem.getPathMatcher(String)
      • parseFileNames

        public static List<org.apache.commons.lang3.tuple.Triple<String,​String,​String>> parseFileNames​(List<String> fileNamesToParse)
        Parses the given file paths into structured path components. Each file name is resolved and normalized to an absolute path, which is then split into the components described below.
        Parameters:
        fileNamesToParse - List of file names to parse.
        Returns:
        A list of Triple objects where the first element is the root of the file path, the second is the full path without the file name, and the third is just the file name itself.
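        To illustrate the three components, the sketch below splits a single path with java.nio.file.Path. It is an assumption about the general approach, not the actual parsing logic: the class PathPartsSketch and the method split are hypothetical, and a plain String array stands in for the Apache Commons Triple to keep the example free of dependencies.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathPartsSketch {
    // Returns { root, path without the file name, file name } for one input path.
    static String[] split(String fileName) {
        // Resolve against the working directory and collapse "." and ".." segments.
        Path path = Paths.get(fileName).toAbsolutePath().normalize();
        String root = path.getRoot() == null ? "" : path.getRoot().toString();
        // A bare root has no parent or file name; fall back to sensible defaults.
        String parent = path.getParent() == null ? root : path.getParent().toString();
        String name = path.getFileName() == null ? "" : path.getFileName().toString();
        return new String[] { root, parent, name };
    }

    public static void main(String[] args) {
        String[] parts = split("/data/logs/../logs/access.log");
        System.out.println("root:   " + parts[0]);
        System.out.println("parent: " + parts[1]);
        System.out.println("name:   " + parts[2]);
    }
}
```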
      • localDirExists

        public static boolean localDirExists​(String localDirName)
        Checks if a local directory exists or not.
        Parameters:
        localDirName - Name of the local directory to check for.
        Returns:
        True if the directory exists, and false if it doesn't exist or if the localDirName is null or empty.
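        A check with the same documented contract (false for null, empty, or missing directories) could look like the sketch below. The class DirCheckSketch and the method dirExists are hypothetical, and the use of Files.isDirectory is an assumption about how such a check might be written.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

class DirCheckSketch {
    // True only if the name is non-null, non-empty and refers to an existing directory.
    static boolean dirExists(String localDirName) {
        if (localDirName == null || localDirName.isEmpty()) {
            return false;
        }
        return Files.isDirectory(Paths.get(localDirName));
    }

    public static void main(String[] args) {
        System.out.println(dirExists(null));                                  // false
        System.out.println(dirExists(""));                                    // false
        System.out.println(dirExists(System.getProperty("java.io.tmpdir"))); // normally true
    }
}
```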
      • localFileExists

        public static boolean localFileExists​(String localFileName)
        Checks if a local file exists or not. If the file name is a wildcard pattern, it skips the check.
        Parameters:
        localFileName - Name of the file.
        Returns:
        True if the file exists, false otherwise.
      • getKifsPathSeparator

        @Deprecated(since="7.2.3",
                    forRemoval=true)
        public static String getKifsPathSeparator()
        Deprecated, for removal: This API element is subject to removal in a future version.
        Returns:
        The separator character used by KiFS between a directory and the files it contains; can also be used in file names to create "virtual" subdirectories.
      • getFileInfoFromServer

        protected List<KifsFileInfo> getFileInfoFromServer​(String path)
                                                    throws GPUdbException
        Retrieves the file stats for the files residing in KiFS.
        Parameters:
        path - Name of the KiFS file or directory of files to retrieve info on.
        Returns:
        List of KifsFileInfo objects for the KiFS file(s) found.
        Throws:
        GPUdbException - If the KiFS lookup fails.